This article describes a protocol for a multicentre, blinded, randomised placebo surgery controlled trial, using a parallel two-arm design to investigate the efficacy of arthroscopic partial meniscectomy (APM) in patients with degenerative meniscus injury.
We also describe a novel ‘RCT within-a-cohort’ study design that aims to improve the generalisability of randomised controlled trials (RCTs).
The immediate goal of this trial is to provide evidence on the ‘true’ efficacy of APM, the most common orthopaedic procedure.
Controlling for the placebo effect is a critical aspect of experimental design in any clinical research, particularly when assessing the success of treatment of degenerative orthopaedic complaints. Optimally, it requires the use of placebo surgery.
The ‘RCT within-a-cohort’ study design will mark the first undertaking of an orthopaedic trial to address the inevitable trade-off between internal and external validity. The rigorous design, organisation and execution of this trial aim to set a new standard for future clinical trials in orthopaedics.
Strengths and limitations of this study
Our protocol paper presents a detailed ethical analysis of the use of placebo surgery, offering a framework that future placebo surgery controlled trials in orthopaedics could follow.
The rationale supporting placebo surgery controlled design includes demonstrated considerable ‘placebo effect’ in surgical trials and optimal blinding of both the patient and the physician-researcher.
The study is limited to degenerative meniscus injury. The definition of the concepts ‘degenerative’ or ‘traumatic’ in the context of meniscal injuries is arbitrary in nature.
This is a protocol paper; the trial is underway, but has not been completed.
Middle-aged men and women with knee pain attributed to degenerative knee disease (a degenerative meniscal injury accompanied by varying degrees of knee osteoarthritis (OA)) constitute the largest group of patients referred to orthopaedic surgeons.1–4 The prevailing understanding regarding the aetiology and treatment of degenerative knee disease is quite mechanical in nature: knee pain perceived by the patient is attributed to a degenerative meniscus injury or chondral derangement. The symptoms related to degenerative meniscus injury are commonly chronic and fluctuating in nature. However, somewhat paradoxically, a recent population-based MRI study showed degenerative meniscal tears to be highly prevalent in middle-aged and elderly adults and they seem to be asymptomatic in the majority of individuals.5 ,6 The current treatment strategy of early degenerative knee disease includes rest, oral non-steroidal anti-inflammatory drugs, glucocorticosteroid injections, physiotherapy and ultimately arthroscopic removal of mechanical derangement (arthroscopic partial meniscectomy (APM) or debridement).7 In the USA alone, one million knee arthroscopies are performed annually to treat degenerative knee disease, making arthroscopic surgery of the knee by far the most common orthopaedic procedure.8 ,9 Many patients report improvement (reduced knee pain, better function and improved quality of life) after this kind of surgery.10–13 However, similar results have also been obtained with conservative treatment in randomised, non-placebo-controlled trials.14–16 To date, there is a dire lack of high-quality evidence (from randomised controlled trials, RCT) on the efficacy of APM.17 This issue cannot be addressed simply by evaluating the outcomes of patients who have undergone surgery, as the role of the underlying degenerative process and the surgical procedure cannot be unravelled in such a study design.18 ,19
It is widely agreed that controlling for the placebo effect is a critical aspect of experimental design in any clinical research.20 ,21 A comprehensive meta-analysis of the ‘placebo effect’ indicated that subjective outcomes may be attributable to ‘placebo effects’ or bias.20 As subjective measures or patient-reported outcomes—such as the amount of pain or perceived disability—are often used and also widely recommended as primary outcomes in assessing the success of treatment of degenerative orthopaedic complaints,22 ,23 the issue of the possible ‘placebo effects’ of surgery is of utmost relevance. This is an obvious argument favouring the use of sham surgery to control for the imminent bias. Optimal blinding of both the patient and the physician-researcher is another clear asset of the sham-surgical model: patients are biased towards favourable outcomes after surgical intervention because they want to believe that they chose the correct option for their care. This ‘leap of faith’ is believed to be greater in surgery than in medical trials, in which the perceived and real risks of the intervention may be more subtle, less severe and do not involve the pain and risks of invasive procedures.24 Equally importantly, sham surgery is ideal for minimising the investigator-bias through true blinding of the outcome assessor.25
Although the first-ever sham surgery controlled clinical trial was published as early as in 1959,26 there has been a paucity of placebo surgery controlled trials in orthopaedics.27 ,28 Even discussion regarding the use of the sham surgery model has been limited and generally condemning or censorious.27 ,29 ,30 The essential question regarding the ethics of the use of placebo surgery concerns the dilemma between the provision of the highest standard of research design (the double-blind, randomised, placebo-controlled trial) and the highest standard of ethics (to do no harm to the patient).29 When evaluating the merits of a study protocol, the institutional review board (IRB) should consider not only the risks and benefits that may result from the research but also the possible long-term effects of applying knowledge gained in the research as among the research risks that fall within the purview of its responsibility.21 ,27 Dowrick and Bhandari recently proposed that there are four areas that deserve particular attention when deciding on the appropriateness of a sham surgery controlled RCT, namely the equipoise, risk minimisation, informed consent and deception. As our views on the last three issues are thoroughly expressed in the ethical annex (box 1), we will focus here on equipoise. In short, the statement by Dowrick and Bhandari21 that “In orthopaedics, the reality is that many treatments are ‘known,’ ‘accepted,’ or ‘standard’ but may not yet be ‘proven’ in the true sense of the word as it applies to scientific research” captures the essence of equipoise and APM.
Ethical considerations of using placebo surgery control, as outlined by Miller31
I. Is there scientific and clinical value in conducting this study?
Although arthroscopic partial meniscectomy (APM) is the most common orthopaedic procedure, no randomised, placebo-controlled study of its efficacy (risk-benefit ratio) exists. We would like to emphasise that a properly conducted efficacy trial on APM could provide a better evidence base for this procedure and theoretically eliminate unnecessary procedures or at least provide the field with prognostic factors for good/poor outcome.
II. Is the use of placebo surgery methodologically necessary or desirable to achieve valid results?
Given, on the one hand, the subjective and somewhat vague nature of the symptoms related to degenerative meniscus injury (a fluctuating course that ranges between individuals from spontaneous disappearance of the symptoms to quite dramatic impairment in the quality of life) and, on the other, the well-known strong placebo effect of surgical interventions,27 ,32 we feel confident in stating that there is indeed no alternative to using placebo surgery as a control.
III–V. Risk–benefit assessment
A basic ethical requirement of clinical research is to minimise risks.33 This does not mean that risks can be reduced to zero or that risks must be minimal.31 The risks inherent in our placebo surgery procedure are related to spinal anaesthesia and arthroscopy. Spinal anaesthesia does carry a modest risk (mainly of postspinal headache), but similar risks have been considered acceptable for other medical investigations, such as spinal tap for neurological diseases, muscle biopsies for neurological or orthopaedic study designs, or bronchoscopy.34 As for the risks related to knee arthroscopy per se, the most common are wound infection, deep venous thrombosis and pulmonary embolism with respective incidences of 0.22%, 0.12% and 0.08%.35 There is naturally also a risk of iatrogenic lesions (to the chondral surfaces) inside the knee during the procedure, but given that all surgeons performing the operations in the RCT are highly skilled, with experience of hundreds of previous knee arthroscopies, this risk is considered negligible. In addition, knee arthroscopy is commonly used as a diagnostic procedure for patients with knee pain, and accordingly the patients allocated to the placebo surgery arm can be considered to gain by being assured that no pathology other than a degenerative meniscus injury exists in their knee joint. The other ethical dilemma regarding our procedure actually lies at the heart of our study design: the potential harms and benefits related to the resection of the torn meniscus. The existing evidence shows quite convincingly no benefit of knee arthroscopy in the prevention of the degenerative process (knee osteoarthritis (OA)),36–38 and accordingly few would dispute that the goal of APM in patients with degenerative meniscus injury is solely to relieve symptoms.
On this note, there is naturally a risk that patients allocated to the sham arm will suffer from the delay in getting the potentially beneficial procedure, a risk that we have attempted to minimise by allowing patients to request a reassessment of their symptoms (and potentially cross over to the APM arm) after 6 months of follow-up. It should also be noted that all the risks discussed above are equally related to the current treatment strategy for degenerative meniscus injury, the resection of the meniscus, and that recent evidence actually implies that meniscectomy is associated with an increased risk of future knee OA.39 ,40 Finally, we contend that there are no objective tools for measuring risk-benefit ratios in research. As Gillett has suggested: ‘an acceptable procedure for the placebo arm of a surgical trial is one that carries no more risk than the investigation that would be used to diagnosis and treatment’.41 We argue that the relatively minor risks related to the placebo procedure are justifiable to answer the clinically important question of whether arthroscopic surgery is effective in the treatment of degenerative meniscus injury.
VI. Do the individuals give informed consent?
The ethical issues of the trial will be thoroughly explained and discussed with each recruited patient, both verbally and in writing. The basic principles laid down in the Declaration of Helsinki will be followed throughout the execution of the trial. Accordingly, each participant has the right to withdraw from the study at any given moment without having to explain this decision in any way. Finally, the possibility of being randomised to placebo surgery will be reiterated to each participant before the actual randomisation.
RCT is considered the gold standard of research design in terms of methodological rigour (internal validity). Ideally, a well-designed RCT should not only have high internal validity but also preferably high external validity (generalisability).25 However, realistically, such a ‘wish’ is an obvious paradox, as there is an almost inevitable trade-off between internal and external validity. In essence, the purpose of a true efficacy (or explanatory) trial is to demonstrate that an intervention can work theoretically under optimal conditions (‘best-case scenario’). The patients recruited are thus carefully selected to obtain as homogeneous a sample as possible and they are also treated under ideal conditions (eg, by the most skilled/experienced physicians). An effectiveness (or pragmatic) trial, in turn, is aimed at testing how an intervention works under usual practice circumstances, and for this reason has a high external validity, but the internal validity is usually lower.25 ,42 ,43 Relton et al44 recently described a novel ‘cohort multiple randomised controlled trial’ design, in which the benefits of a large observational cohort and a pragmatic RCT can be combined. However, this pragmatic design cannot be used for trials with a placebo comparator (ie, ‘true’ efficacy design) as the patients may not receive placebo in routine healthcare.
In this paper, we describe the protocol of our ongoing placebo surgery controlled trial assessing the efficacy of APM. To address some of the problems related to both efficacy and effectiveness RCTs, we also briefly introduce a novel ‘RCT within-a-cohort design’. This protocol paper has been written according to the recently published SPIRIT guidelines.45 Hopefully, the publication of this article will make our study objectives and methods more transparent and consequently improve the eventual utility of our study.46
Methods and analysis
We are conducting a parallel (1 : 1), multicentre, randomised and placebo surgery controlled superiority trial to assess the efficacy of APM for patients with a degenerative medial meniscus injury (figure 1). Patients, outcome assessors and data analysts (before the final writing of the manuscript) will be unaware of treatment assignments (triple blind approach).
All patients have been given a written informed consent form and have been made aware that on entering the study they may receive only placebo surgery, in which case the meniscus tear will be left untreated. They also know that participation in the study is entirely voluntary and that a refusal to participate will not affect their future care. In addition, every patient has been informed of their right to withdraw from the trial whenever they desire without giving the researchers any reason for their decision.
We have enrolled patients at the outpatient clinics of five secondary or tertiary hospitals in Finland. The study was first launched at one site, but to improve recruitment after 2 years (and to ensure multicentre design with its obvious benefits to the generalisability of the results), the study was expanded to four additional sites. The sites were selected on the basis of having an established practice of knee arthroscopy and also an experienced arthroscopic knee surgeon keen to participate in the study. All orthopaedic surgeons involved in the actual interventions of the study are members of the Finnish Arthroscopy Association and have experience of at least 1300 (range 1300–2500) knee arthroscopies with an annual rate of at least 200 (range 200–350) arthroscopies. Before the actual launch of the study at the four additional study centres, the protocol was discussed in minute detail to ensure that it would be adhered to in a similar manner in all five centres. The principal clinical investigator and the study coordinator also visited every centre at least once before the study was begun at these selected sites. Further, regular meetings have been held biannually among all study members to ensure that the protocol is carried out as planned.
Patients referred to any of the five orthopaedic surgeons and complaining of knee pain consistent with a meniscus injury were assessed for eligibility. Eligible patients were 35–65 years of age with persisting symptoms (more than 3 months) consistent with a degenerative medial meniscus injury and unresponsive to the appropriate conservative treatment. Surgeons performed a detailed examination of the knee, documenting the range of motion, the presence of an effusion and the results of the meniscal and stability tests. The meniscus injury had to be symptomatic (joint line tenderness, pain in forced flexion or a positive McMurray test). Patients with clinical (American College of Rheumatology) criteria47 or radiographic knee OA were excluded. Radiographic exclusion criteria included knee OA (grade 2–4) according to the Kellgren and Lawrence scale48 depicting definite joint space narrowing or definite osteophytes. Other exclusion criteria were a history of obvious traumatic onset of the symptoms (symptoms resulting from kneeling, twisting or slipping without an actual fall were not considered ‘traumatic’ despite a possible sudden onset), recent episodes of locked knee (the knee cannot be fully extended), previous surgical treatment of the knee, instability or restricted range of motion, previous (within 1 year) fracture of the index extremity and findings on MRI or arthroscopy of any other pathology than a degenerative meniscus injury (requiring an intervention other than APM; the final assessment of chondral lesions was carried out arthroscopically and, if deemed osteoarthritic, they were left untreated). Patients unable to provide informed consent were also excluded. All inclusion and exclusion criteria are listed in box 2.
Inclusion and exclusion criteria used in the randomised controlled trial
Inclusion criteria
Age: 35–65 years of age
Persistent (>3 months) pain on the medial joint line of the knee
Pain provoked by palpation or compression of the joint line or a positive McMurray sign
MRI showing signals characteristic of medial meniscus injury
Degenerative injury to the medial meniscus confirmed at arthroscopy
Exclusion criteria
Trauma-induced onset of symptoms
Locked knee (that cannot be straightened normally)
Previous surgical procedure on the affected knee
Clinical osteoarthritis (OA) of the knee (American College of Rheumatology criteria)47
Radiological OA of the knee (Kellgren-Lawrence grade >1)48
Acute (within the previous year) fracture of the affected extremity
Decreased range of motion of the knee
Instability of the knee
MRI assessment shows pathology other than degenerative knee disease requiring treatment other than arthroscopic partial meniscectomy (APM)
Arthroscopic examination reveals pathology other than a degenerative injury to the medial meniscus requiring intervention other than APM
Baseline assessment included sex, birth date, height and weight, mechanism of injury (onset of symptoms), time from the onset of symptoms and standard clinical examination. To ensure the adequacy of conservative treatment before the operation, all patients deemed eligible for inclusion received a standardised conservative treatment protocol (exercise programme) administered by a physiotherapist, together with written instructions.
Radiography and MRI
Weight-bearing posteroanterior knee radiography with the use of a fixed-flexion protocol (knees in 20° of flexion and the beam oriented 10° above the horizontal axis)49 ,50 and a whole leg radiograph for assessing possible malalignment were performed for all individuals. MRI scans were obtained with the use of a 1–3-Tesla scanner with a phased-array knee coil. An increased meniscal signal communicating with the inferior, superior or free edge of the meniscal surface seen on two consecutive slices was considered indicative of a meniscal injury.51 The radiographs and MRI scans were assessed by both a musculoskeletal radiologist and an orthopaedic surgeon and the final grading was carried out by consensus.
After the baseline visit, the eligible patients were placed on a waiting list for surgery to be operated on within the subsequent 3 months. If the symptoms subsided between the baseline assessment and the day of surgery, the patient was excluded from the trial. These patients, along with those who met the eligibility criteria but declined to participate in the RCT, were asked to give permission for follow-up with a protocol identical to the RCT participants.
Arthroscopic examination of the knee was first performed for all patients using standard anterolateral and anteromedial portals with a 4 mm arthroscope (under spinal or general anaesthesia (one site) according to the normal practice of the respective hospitals). The orthopaedic surgeon evaluated the medial, lateral and patellofemoral joint compartments and graded the articular lesions according to the ICRS52 classification and the meniscus injury(-ies) according to the classification by Cooper et al.53 Patients with obvious chondral flaps or other chondral injuries indicative of trauma, loose bodies or any finding other than a degenerative medial meniscus injury (requiring an arthroscopic procedure) were excluded from the trial. After a thorough arthroscopic examination of the knee (confirming the eligibility of the patient for the RCT), the patients were randomly assigned to receive APM or placebo surgery. At the four study sites using spinal anaesthesia, the blinding of the patients was further ensured by shielding their view with a vertical drape and aiming the arthroscopy monitors away from their line of vision.
Randomisation and concealment
To enter a patient into the study, a research/staff nurse opened an envelope containing the treatment assignment and revealed it to the surgeon by showing the paper, but the allocation was not expressed verbally. The sequentially numbered, opaque, sealed envelopes were prepared by a statistician with no clinical involvement in the execution of the trial using a computer-generated schedule and the envelopes were kept in a secure, agreed location at each centre. To minimise the risk of predicting the treatment assignment of the next eligible patient (to ensure concealment), randomisation was performed in blocks of varying size (block sizes known only to the statistician). The randomisation sequence was stratified by four factors: the study site, age (35–50 or 51–65 years), sex (female or male) and the existence of radiographic knee OA (none (K/L 0) or minor degenerative changes (K/L 1)).48
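The allocation scheme described above can be illustrated with a minimal sketch of stratified, permuted-block randomisation. This is not the statistician's actual program: the block sizes (2, 4 and 6), the seed, the site labels and the function names are all hypothetical, chosen only to show how whole blocks of random size keep the two arms balanced within each of the 5 × 2 × 2 × 2 strata.

```python
import itertools
import random

ARMS = ("APM", "placebo")


def permuted_block_schedule(min_slots, rng, block_sizes=(2, 4, 6)):
    """Return a 1:1 allocation list built from whole blocks of random size.

    block_sizes is a hypothetical choice; in the trial the real sizes are
    known only to the statistician.
    """
    schedule = []
    while len(schedule) < min_slots:
        size = rng.choice(block_sizes)
        block = list(ARMS) * (size // len(ARMS))  # equal numbers per arm
        rng.shuffle(block)                        # random order within block
        schedule.extend(block)
    return schedule


def build_stratified_schedules(seed, min_slots_per_stratum=12):
    """One independent schedule per stratum (site x age x sex x K/L grade)."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    sites = ["site 1", "site 2", "site 3", "site 4", "site 5"]
    age_bands = ["35-50", "51-65"]
    sexes = ["female", "male"]
    kl_grades = ["K/L 0", "K/L 1"]
    strata = itertools.product(sites, age_bands, sexes, kl_grades)
    return {s: permuted_block_schedule(min_slots_per_stratum, rng) for s in strata}


schedules = build_stratified_schedules(seed=2007)
```

Each entry of a stratum's list would correspond to one sequentially numbered envelope for that stratum. Because only complete blocks are appended, every list contains equal numbers of the two arms.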
Arthroscopic partial meniscectomy
The damaged and loose part of the meniscus tissue was removed with arthroscopic instruments (mechanised shaver and meniscal punches) until solid meniscus tissue was reached. The meniscus was then probed to ensure that all loose and weak fragments and unstable meniscus tissue had been successfully resected, preserving as much of the meniscus tissue as possible.54 No other surgical procedure (synovectomy, debridement, excision of fragments of articular cartilage or chondral flaps, or abrasion and/or microfracture of chondral defects) was performed.
In the placebo surgery group, a standard APM procedure was simulated. The surgeon asked for all instruments and manipulated the knee as if APM was being performed. The mechanised shaver (without the blade) was also pushed firmly against the patella, outside of the knee, to mimic as closely as possible the feelings and sounds of the normal use of the arthroscopic shaver. Further, to simulate the sounds of normal APM, suction was also used to drain the joint and saline was splashed. The patient was kept in the operating room for the amount of time required to perform an actual APM.
All procedures were standardised across sites. Furthermore, all arthroscopies were video-recorded for later analysis and to ensure the comparability of all interventions.
In both APM and placebo surgery groups, the postoperative care was delivered according to a standard protocol specifying that all patients received the same walking aids and graduated exercise programme. Analgesia was given according to standard practice. All patients were discharged after full postoperative recovery. The staff delivering the care was blind to the treatment allocation. The notes specifying the actual procedure (APM or placebo) were sealed inside an envelope, but the arthroscopic findings and a phrase ‘a procedure according to the degenerative meniscus trial was performed’ were included in the patients’ medical records.
Prospective cohort (for the ‘RCT within-a-cohort’ design)
Additionally, in one of the five study centres (the initial site in which over half of the participating patients were enrolled), all patients scheduled for a knee arthroscopy were followed up prospectively with a follow-up assessment protocol identical to that in the actual trial. Those with a degenerative meniscus injury (but who had somehow bypassed our recruitment, were treated by an orthopaedic surgeon not participating in the trial or were excluded from the trial) form a non-randomised, pragmatic, prospective cohort. This pragmatic cohort, together with the patients treated in the actual RCT (those randomised and also those who declined to be randomised), constitutes the ‘RCT within-a-cohort’ study population (figure 2A).
The symptoms currently believed to be attributable to a torn meniscus are mostly subjective (pain, discomfort and possible functional deficits) and fluctuating in nature, and accordingly our outcomes are patient-administered outcome measures. The follow-up scheme of this trial consists of four questionnaires (one condition-specific outcome measure, one pain measure, one disease-specific quality of life index and one generic health-related quality of life (HRQoL) index), which are administered preoperatively (on the day of surgery), and then again at 2, 6 and 12 months postoperatively.
Postoperative questionnaires were/are mailed to the patients with an enclosed prepaid return envelope. In case of non-response, the study coordinator contacts these participants by phone, requesting the return of the follow-up questionnaire. At the 12-month follow-up point, the patients are also asked to respond to the following questions: (1) which procedure (APM or placebo surgery) do they think they have undergone, (2) is the knee better than before the intervention, (3) are they satisfied with their knee at present and (4) would they choose to be operated on again if they had to make the decision now. A flowchart of the follow-up scheme is presented in figure 3.
Lysholm knee score
The Lysholm knee score was originally designed to evaluate knee function and symptoms in activities of daily living in patients with anterior cruciate ligament insufficiency.55 It has later been adjusted and validated for the evaluation of meniscal injuries.56 ,57 The Lysholm knee score is a condition-specific outcome measure with eight domains: limp, locking, pain, stair climbing, use of supports, instability, swelling and squatting.58 An overall score of 0–100 points is calculated, 0 denoting the worst possible outcome and 100 the opposite.58
Numerical rating scale for pain
Knee pain at rest and after exercise (over the last week) will be assessed on a 0–10 numerical rating scale (NRS), anchored by ‘no pain’ at the left (0) and ‘extreme pain’ at the right (10). NRS is easy to administer and has been validated as a measure of pain intensity in populations with known pain.59
Western Ontario Meniscal Evaluation Tool
The Western Ontario Meniscal Evaluation Tool (WOMET) is a disease-specific tool designed to evaluate HRQoL in patients with meniscal pathology.60 It was developed to include the items most important to patients with a meniscal tear61 and was recently validated for patients with degenerative meniscus injury.62 WOMET contains 16 items addressing three domains: nine items on physical symptoms, four on disabilities due to sports, recreation, work and lifestyle, and three on emotions.60 Each WOMET item is scored on a visual analogue scale (100 mm lines anchored at the ends). The best or least symptomatic total score is 0 and the most symptomatic possible is 1600.60 For the sake of simplicity, the score is usually converted to a percentage of normal, in which 0 denotes the worst possible situation and 100 the best possible situation.
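The conversion to a percentage of normal can be sketched as follows. The formula is an assumption inferred from the stated ranges (a raw total of 0 is best and 1600 is worst, while 100% is best and 0% is worst); the function name is hypothetical and the official WOMET scoring instructions should be taken as authoritative.

```python
def womet_percent_of_normal(item_scores_mm):
    """Convert 16 WOMET item scores (0-100 mm each) to a percentage of normal.

    Assumed linear rescaling: raw total 0 -> 100% (best), 1600 -> 0% (worst).
    """
    if len(item_scores_mm) != 16:
        raise ValueError("WOMET has 16 items")
    if any(not 0 <= score <= 100 for score in item_scores_mm):
        raise ValueError("each item is marked on a 100 mm line")
    raw_total = sum(item_scores_mm)
    return (1600 - raw_total) / 1600 * 100
```

For example, a patient marking 25 mm on every item has a raw total of 400 and a converted score of 75.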
The 15D instrument is a generic HRQoL instrument comprising 15 dimensions.63 For each dimension, the respondent must choose one of the five levels that best describes his/her state of health at the moment (the best level being 1 and the worst level being 5). A set of utility or preference weights is used in an additive aggregation formula to generate a single index number, the utility or 15D score. The maximum 15D score is 1 (no problems on any dimension) and the minimum score is 0 (being dead). The responsiveness, reliability and validity of 15D have been thoroughly established, and this instrument has been used extensively in clinical and healthcare research.64 ,65
Initially, our two primary outcome measures were the Lysholm knee score55 and the NRS for pain (VAS) at 12 months after surgery. The secondary outcomes were the WOMET60 score at the same time point and a cost-utility analysis (based on the patients’ general quality of life using the 15D score and the utilisation of healthcare resources). At this preregistration stage (2007), the Lysholm score was conceived of as a prespecified primary outcome as this tool has been validated for meniscus injury56 and is also the outcome instrument used most often for various knee conditions.58 ,66 However, during the recruitment phase of the trial, we showed that WOMET has acceptable psychometric properties for the evaluation of patients with a degenerative meniscus injury and early OA,62 and accordingly we have now decided to use it as our third primary outcome.
Patients were told at the time of giving their consent that they would be allowed to consider crossing over to the other procedure 6 months or later after the arthroscopic procedure if adequate relief of symptoms was not achieved (based on the existing evidence showing that it can take up to 6 months postoperatively to obtain the actual benefit from the APM).67 ,68 No specific numerical thresholds of outcome measures were set for the option for crossover, but the patients will be asked in the 6-month follow-up questionnaire about their willingness to undergo reoperation based on their current symptoms. If a patient reports symptoms requiring attention at 6-month follow-up or becomes symptomatic after that, an appointment will be scheduled with an orthopaedic surgeon both blind to the initial treatment and not involved in the care of the patient before the appointment. If the clinical examination carried out at this point reveals clinical signs indicative of/consistent with a meniscal tear and the patient requests reoperation, the allocation will be unsealed and the patient will be offered APM. The number of patients both requiring additional follow-up appointments and unsealing of the allocation is registered in both the APM and placebo groups. MRI is not automatically performed for symptomatic patients because of its low specificity in differentiating residual or recurrent symptoms from the signal due to postresection (APM).69 However, in case of a new trauma to the index knee or exceptionally severe symptoms, an MRI examination will also be carried out. In addition to the aforementioned predetermined assessment of the possible adverse events in the 6-month follow-up questionnaire, the patients are instructed to contact the study coordinator with any possible complaints related to the knee surgery at any given moment during the postoperative period. These contacts (events) will be registered.
Loss to follow-up
As in an earlier placebo surgery controlled trial,70 we will document the number and proportion of individuals eligible for and compliant with each follow-up. Individuals who die during the study (from causes unrelated to the study or procedure) will also be tabulated. If the proportion of individuals withdrawing from either arm is greater than the anticipated 20% at 1 year, an analysis of the demographic and prognostic characteristics will be made between the individuals who withdraw and those who remain in the study. For continuous variables, parametric or non-parametric analysis of variance will be used. For categorical variables, χ2 or Fisher’s exact test will be applied.
When individual items are missing from a scale, we will calculate the percentage of missing items. If fewer than 10% of the items are missing, we will impute values using the mean of the remaining items. If more than 10% of the items are missing, the patient will be excluded from the analysis at that time point. For the WOMET score, our approach to missing values is slightly different: if a patient fails to complete 1–3 items, we will substitute the missing value(s) with the average value of the answered items according to the protocol described previously for the Western Ontario and McMaster Universities Osteoarthritis Index, a similar outcome tool used for the assessment of established knee OA.71 ,72
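The missing-item rules can be sketched in a few lines of Python. This is an illustration only, not the trial's software: the function names are hypothetical, missing answers are assumed to be coded as `None`, and the handling of exactly 10% missing items (treated here as imputable) is an assumption, because the rule above does not cover that boundary.

```python
from statistics import mean
from typing import List, Optional, Sequence

Item = Optional[float]  # None marks an unanswered item (assumed coding)


def impute_scale_items(items: Sequence[Item],
                       max_missing_fraction: float = 0.10) -> Optional[List[float]]:
    """Mean-impute missing items, or return None if too many are missing.

    Assumption: a scale with exactly `max_missing_fraction` missing is imputed.
    """
    answered = [x for x in items if x is not None]
    n_missing = len(items) - len(answered)
    if n_missing == 0:
        return list(items)
    if not answered or n_missing / len(items) > max_missing_fraction:
        return None  # patient excluded from this analysis at this time point
    fill = mean(answered)
    return [fill if x is None else x for x in items]


def impute_womet_items(items: Sequence[Item],
                       max_missing_items: int = 3) -> Optional[List[float]]:
    """WOMET variant: impute when 1 to 3 items are missing, else return None."""
    answered = [x for x in items if x is not None]
    n_missing = len(items) - len(answered)
    if n_missing == 0:
        return list(items)
    if not answered or n_missing > max_missing_items:
        return None
    fill = mean(answered)
    return [fill if x is None else x for x in items]
```

Returning `None` rather than a partial score keeps the exclusion explicit, so a downstream analysis cannot silently use an incomplete scale.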
For the sample size calculation, we used our prospectively collected database of 377 consecutive patients with a degenerative meniscus injury who had undergone APM (data not published). Minimal clinically important improvement (MCII) was assessed with an anchor-based method (patient's global impression of change used as an external anchor).73 MCII was found to be 11.5 for the Lysholm knee score and 15.5 for the WOMET score. SDs for the response (postoperative score minus preoperative score) of the same population were 21.1 and 24.5 for the Lysholm and WOMET scores, respectively. For the determination of the MCII for pain, we extrapolated our estimates from the existing literature on pain-VAS. Patients with knee OA report the MCII for pain to be 19.9 mm,74 and in our own cohort, the SD of the response of pain after exercise was 31.4 mm. Using these estimates, the required sample sizes were 40, 54 and 40 participants per group for the Lysholm score, WOMET score and pain assessment, respectively, with 80% power to show a clinically meaningful advantage of APM over placebo, based on a two-sided type 1 error rate of 5%. Given an anticipated dropout level of 20% and the possible problems related to uneven randomisation (multicentre design with sites as one stratum), we decided to recruit 70 patients per group.
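As a rough illustration of this kind of calculation, the sketch below applies the standard normal-approximation formula for comparing two means, n = 2(z₁₋α/₂ + z₁₋β)²(SD/MCII)² per group. The formula and the function names are assumptions made for illustration; the text does not state which method or software produced the reported figures, so the results need not match them exactly.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mcii: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per group for a two-sided, two-sample comparison of means.

    Normal approximation (assumed method, not necessarily the authors').
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / mcii) ** 2)


def inflate_for_dropout(n: int, dropout: float) -> int:
    """Enlarge a per-group sample size to allow for an anticipated dropout rate."""
    return ceil(n / (1 - dropout))


if __name__ == "__main__":
    # MCII and SD of response as quoted in the text
    inputs = {"Lysholm": (11.5, 21.1), "WOMET": (15.5, 24.5), "Pain": (19.9, 31.4)}
    for name, (mcii, sd) in inputs.items():
        print(name, n_per_group(mcii, sd))
    # Largest per-group requirement quoted in the text, with 20% dropout
    print("With dropout:", inflate_for_dropout(54, 0.20))
```

Under this approximation, the quoted inputs give about 53, 40 and 40 participants per group for the Lysholm, WOMET and pain inputs, in that order. Inflating the largest reported requirement (54) for 20% dropout gives 68, which is consistent with the decision to recruit 70 per group.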
For obvious ethical and safety reasons, an interim analysis by an independent data and safety monitoring board (the National Institute for Health and Welfare) was carried out after the enrolment of 70 (50%) patients to ensure that the rates of complications or rearthroscopies were within acceptable limits (within the normal rate of complications and/or reoperations related to knee arthroscopy). The analysis was specified to be carried out blind to the group assignment, with the allocation concealment preserved unless a marked discrepancy was found in the incidence of complications/reoperations. No other interim analysis is envisaged.
At each study site, the data are collected by a research nurse dedicated to the study. Forms are mailed to the coordinating centre (Hatanpää Hospital, Tampere), where all data are entered into a computerised database. After the completion of follow-up, all data are re-entered by an individual with no clinical involvement in the execution of the trial and then these two separate databases are compared for consistency. Participant files will be maintained in storage (both in electronic and paper format) at the coordinating centre for a period of 5 years after completion of the study.
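The consistency check between the two independently entered databases can be sketched as a field-by-field comparison. This is a minimal sketch: the data layout (a mapping from a patient identifier and field name to the entered value) and the function name are hypothetical and do not describe the trial's actual database.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]                # (patient_id, field_name), hypothetical layout
Entry = Dict[Key, Optional[object]]


def find_entry_discrepancies(first: Entry, second: Entry) -> List[Tuple[Key, object, object]]:
    """List every field whose value differs between two data-entry passes.

    A field present in only one pass is reported with None for the other.
    """
    discrepancies = []
    for key in sorted(set(first) | set(second)):
        a, b = first.get(key), second.get(key)
        if a != b:
            discrepancies.append((key, a, b))
    return discrepancies


if __name__ == "__main__":
    pass_one = {("P001", "lysholm_12m"): 85, ("P002", "lysholm_12m"): 70}
    pass_two = {("P001", "lysholm_12m"): 85, ("P002", "lysholm_12m"): 77}
    for key, a, b in find_entry_discrepancies(pass_one, pass_two):
        print(key, a, b)  # each mismatch is resolved against the paper form
```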
Blinded data analysis
The Steering and Writing Committees (RS, MP, AM and TLNJ) will develop and record two interpretations of the results on the basis of a blinded review of the primary outcome data (treatment A compared with treatment B), with one assuming that A is the APM group and another assuming that A is the placebo surgery group. The Writing Committee will also deliberate before data analysis to determine the key analyses and presentation format for the primary publication. The minutes of this meeting are recorded as a statement of interpretation document, which will be signed by all members of the Steering and Writing Committees before the unsealing of the randomisation.75
The trial was designed to ascertain the superiority of APM over placebo at 12 months in the primary outcome measures by means of statistically significant differences between the groups. Baseline characteristics will be analysed by descriptive statistics. For the primary analysis, the change in scores (Lysholm and WOMET) and pain (NRS) at 12 months will be compared between the two study groups. A secondary analysis of the primary endpoint will adjust for those prerandomisation variables which might reasonably be expected to influence the outcome (ie, sex, age, radiological OA and baseline score). A two-sided p value of 0.05 will be considered to indicate statistical significance. We will use the Bonferroni method to adjust the overall level of significance for the multiple (three) primary outcomes. Analyses of the outcome measure scores will also be performed at 2 and 6 months. Additionally, the number and percentage of satisfied patients, those subjectively improved, and those whose allocation was unsealed will be calculated and compared between the two groups. All statistical analyses will be performed on an intention-to-treat basis, meaning that data from patients will be analysed according to the initial study-group assignment regardless of compliance with randomisation. However, as the potential for treatment conversion exists, but only from placebo surgery to APM, a per-protocol analysis will also be carried out. We plan to conduct a subgroup analysis to compare the results based on the degree of radiographic OA (K&L 0 vs I). The serious adverse event rates occurring within 12 months of surgery for both study arms will be evaluated descriptively in aggregate and separately. If the number of events is large enough to allow more sophisticated analysis, the rates between the APM and placebo arms will be compared. The data will be presented by the total number of events and the total number of individuals with at least one event.70
IBM SPSS Statistics (V.20 or later, IBM Corporation) is used for all statistical analyses.
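Although the trial's analyses are run in SPSS, the Bonferroni adjustment for the three primary outcomes is simple enough to illustrate in a few lines of Python. The sketch divides the overall two-sided significance level of 0.05 by the number of primary outcomes; the function names are hypothetical and the p values in the usage example are invented placeholders, not trial results.

```python
from typing import Dict


def bonferroni_threshold(overall_alpha: float, n_outcomes: int) -> float:
    """Per-outcome significance level under the Bonferroni method."""
    return overall_alpha / n_outcomes


def significant_outcomes(p_values: Dict[str, float],
                         overall_alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each outcome whose p value falls below the adjusted threshold."""
    threshold = bonferroni_threshold(overall_alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


if __name__ == "__main__":
    # Placeholder p values for illustration only (not study data)
    example = {"Lysholm": 0.010, "WOMET": 0.030, "Pain": 0.200}
    print(bonferroni_threshold(0.05, 3))  # about 0.0167
    print(significant_outcomes(example))
```

With three primary outcomes, each comparison is therefore tested at roughly 0.0167 rather than 0.05, which keeps the overall chance of a false-positive finding across the three outcomes at or below 5%.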
Ethics and dissemination
Ethical issues related to the use of placebo surgery
Our application to the Pirkanmaa Hospital District IRB contained a specific, six-point ethical analysis focusing on the methodological rationale for use of placebo surgery, risk-benefit assessment and informed consent as outlined by Miller (box 1).31
The findings of this study will be disseminated widely through peer-reviewed publications and conference presentations.
In this protocol paper, we describe the execution of a randomised, placebo surgery controlled trial for the assessment of the efficacy of APM in patients with a degenerative meniscus injury. Given the imminent risk of bias from an unblinded trial of surgery versus medical (conservative) treatment (particularly acknowledging the potential of surgery to produce powerful placebo effects32 ,76 ,77), it is doubtful that a rigorous trial of surgery can be conducted without a sham surgery control when the primary outcome is pain, patient-reported improvement or quality of life.24 ,78 ,79 As the only previous placebo surgery controlled trial in knee arthroscopy32 prompted significant criticism from the orthopaedic discipline at large,15 ,80–91 we decided to use this debate as a template in rationalising our methodological choices (table 1).
Alleged poor generalisability was the most prevalent concern regarding the study by Moseley et al.15 ,80–83 ,85 ,87–91 Explanatory trials are designed to measure the efficacy of an intervention, that is, to find out if the treatment exerts a biological effect in a research setting under ideal and controlled conditions. The factors most consistently found to predict a poor outcome after APM in patients with a degenerative meniscus injury are advanced knee OA and chondral damage.39 ,94 ,95 Also, lateral, rather than medial, meniscectomy has been identified as a predictor of poorer prognosis.39 ,96 Accordingly, to obtain a population with the ‘best-possible response to APM’, we chose to recruit patients with a stable knee, no or minimal OA and an isolated, degenerative injury of the medial meniscus. Although age per se is not a predictor of poor outcome, it has been shown to be associated with increased prevalence of OA and (consequently) inferior functional recovery after APM.39 Similarly, women have been found to have a less favourable prognosis after APM.94 These two factors were thus used as stratifications.
Acknowledging the obvious risk of high internal but low external validity due to the strict eligibility criteria, we decided to also keep track of the patients eliminated from the RCT by prospectively following these individuals with the same outcome assessment arsenal used in the RCT. One of the reasons for collecting this pragmatic cohort was to safeguard against bias attributable to patients declining to take part in a placebo surgery controlled trial. However, unlike in the trial by Moseley et al32 (only about 40% of the eligible patients agreed to participate), we have been very successful in convincing patients to participate (15% of eligible patients declined). This pragmatic cohort will also be used to corroborate or refute whether our efficacy (RCT) population represents the best-case scenario (those with optimal response to APM) (figure 2B).
Single versus multiple surgeons/centres
The study by Moseley et al32 has also been criticised for having only one surgeon performing all operations.15 ,32 ,91 Similar to Moseley et al,32 we also began our trial with only one centre/surgeon, but once the study protocol had been thoroughly refined, it was expanded to four additional sites. The inclusion of five surgeons/study sites entails one problem: we chose to use ‘study site’ as our fourth stratification (separate randomisation sequences for each site), and accordingly, there is an imminent risk of uneven randomisation, as the number of patients allocated at each of the four additional sites is relatively small (projected to be 10–20/site) and there are three other stratifications in the randomisation. However, the use of separate randomisation sequences was a conscious choice, as we wanted to retain the possibility of analysing the data separately from each site.
By postponing the randomisation until the patient is in the operating suite, we have managed to completely eliminate the chance that any eligible patient who gives informed consent would decline to participate in the trial after being randomised. Even though our ‘RCT within-a-cohort’ design allows us to also follow up those declining, the elimination of postrandomisation declining obviously decreases the risk of bias.
Outcome assessment and statistical analysis
According to the CONSORT statement,97 authors should not only define the prespecified outcome considered to be of the greatest importance to relevant stakeholders, but they should also indicate the prespecified time point of primary interest when outcomes are assessed at several time points after randomisation. Few would argue that the goal of treatment in patients with degenerative meniscus injury is anything but to relieve symptoms, particularly as the existing evidence quite convincingly shows that degenerative meniscus injury per se and APM both inevitably lead to the development of knee OA.36–38 As noted previously, it can take up to 6 months postoperatively to obtain the full benefit from the APM,67 ,68 but a more lasting relief of symptoms seems to be confounded by eventual progression of the underlying degenerative process. Accordingly, to be able to showcase the potential efficacy of APM on pain and the quality of life while minimising various types of confounding (eg, non-retention/loss to follow-up and progression of knee OA), we chose a 12-month time point as our time point of primary interest. However, the potential beneficial or detrimental short-term effects of either of the studied procedures (eg, possible transient accelerated recovery) can also be assessed using the data from the 2-month and 6-month follow-up assessments. Further, as Moseley et al32 were also criticised for reporting only average change of a group,86 we decided to ask our patients about their global impression of change with their knee status and their satisfaction with their knee. These two questions enable us to also report the outcome of our population as a proportion of patients who consider they have benefited from the treatment.
As for the prespecified outcome of the greatest importance to patients, Moseley et al32 developed a patient-administered outcome instrument of their own, a methodological choice that was criticised as the tool was not properly validated.15 ,80 ,83 ,86 We have chosen three validated outcome instruments in our trial. To safeguard against potential problems associated with multiplicity of analyses,97 our solution is twofold. We will use the Bonferroni method to appropriately adjust the overall level of significance and also take this issue into account in interpreting our findings: to be deemed hypothesis-proving, the intervention-induced changes have to be well aligned and clinically meaningful in all three validated primary outcomes.
Limitations of the study
The definition of the concept ‘degenerative’ or ‘traumatic’ in the context of meniscal injuries is arbitrary in nature. Traditionally, meniscal injuries or tears have been classified as traumatic or degenerative based on the aetiology (injury mechanism) or morphology (tear pattern observed in MRI or at arthroscopy), but no validated and generally agreed criteria exist to make the distinction between the two entities. Even the definition of the word ‘trauma’ represents a challenge. Although a ‘traumatic’ onset of symptoms is indeed an exclusion criterion in our RCT, the majority of patients with a ‘degenerative’ meniscus injury do experience some kind of twisting movement or other relatively modest injury prior to the onset of their knee symptoms.98 Accordingly, all patients with sudden injuries related to their own voluntary muscle activities (such as kneeling, bending or kicking) and patients with a minor twisting of the knee were included in our trial. In essence, our criteria labelling a tear as ‘traumatic’ describe a more substantial event, such as falling from a chair, stairs or bicycle or slipping on ice. To increase the validity of the classification of the meniscus injury, our assessment includes not only history/clinical examination but also MRI and arthroscopy. Finally, our RCT within-a-cohort design allows us to examine whether the response to APM of those excluded from the RCT because of ‘traumatic’ meniscal injury differs from those included in the RCT.
A similarly controversial issue is the differentiation of chondral injuries that require a surgical procedure (debridement, microfracture or any other procedure) from those that should be considered merely osteoarthritic.99 To the best of our knowledge, there is no objective classification system (either MRI or arthroscopy based) for this distinction. Our rationale was to use preoperative MRI to verify that the patients’ symptoms could be due to degenerative meniscus injury (and to exclude rare conditions such as knee tumours). No cartilage lesions detected in MRI were considered an exclusion, but the final decision whether or not to include the patient in the trial was made at arthroscopy, where the meniscus injury (and commonly associated chondral lesions) were visually and physically (probing) verified and assessed. As there is no valid (RCT-based) evidence to show that surgery would provide any clinically meaningful improvement in patients with degenerative chondral lesions,15 ,32 we instructed our surgeons to generally forbear from surgical intervention. However, if a single large chondral flap or local deep osteochondral lesion was found, the surgeons were given permission to carry out the surgical procedure they deemed appropriate and the patient was excluded. Finally, the only difference between our two allocation groups is whether a meniscus resection was carried out or not. Accordingly, if we made a wrong decision concerning chondral lesions, they should still be evenly divided into the two study arms, thus not compromising our primary objective.
Interpretation of the results: introducing various potential scenarios
To illustrate the various potential findings of the study, we have fitted different scenarios within a larger context of Cochrane's three-step hierarchy of evidence (figure 2B). This scheme outlines the requirements for evidence to justify claims of proven efficacy, effectiveness and, ultimately, the cost-effectiveness of APM.
Quantifying the placebo effect
The placebo effect refers to a positive change in symptoms related to a participant's perceptions of the treatment rather than to the mechanisms of the treatment itself.21 After the publication of their pilot study,92 Moseley et al were questioned about the possibility that patients who enrolled in their placebo surgery controlled study would be more ‘suggestible’ and therefore more susceptible to the placebo effect.93 Moseley responded by proposing two alternative scenarios: either the patients in their RCT would be more susceptible to the placebo effect than the patients who declined to be in the study (suggesting that the characteristics and personality traits of patients who choose to enrol in the studies are different from those declining) or patients who enrol in the RCT are more susceptible to the placebo effect than patients who undergo surgery outside a study. For the sake of brevity, we will not enter into a lengthy elaboration of these two alternative interpretations (which can be found in the detailed response by Moseley100).
Translating this line of thought into our study design, potential differences in the responses to treatment between the RCT study groups (C, D and E, figure 2A) are attributable to either the effect of resection (groups C vs D, figure 2A) or the magnitude of the placebo effect (groups C vs E, figure 2A). What makes our design unique is the opportunity to actually disentangle (quantify) the respective effects of ‘resection’ and ‘placebo’. Further, by comparing (with appropriate baseline adjustment) these estimates with the treatment responses observed in the patients who have undergone APM outside our study (group B, figure 2A) and the more heterogeneous pragmatic cohort (group A, figure 2A), we should also be able to extrapolate the potential effectiveness of the procedure (figure 2B). Theoretically, the placebo effect is more robust in the ‘open label’ (E) and pragmatic groups (A and B) than in the efficacy trial groups (C and D) due to the doubt inherent in any ‘blinded’ RCT (figure 2A). To summarise, there is a placebo effect in all five study groups, not just the placebo surgery group (D).
In this article, we present a protocol for a randomised placebo surgery controlled trial to assess the efficacy of APM in patients with degenerative meniscus tear. We have also discussed some of the methodological issues that we feel are most pertinent to the successful execution of a controlled surgical trial, particularly with respect to minimising bias and maximising the internal and external validity of the study.
We would like to thank Timo Järvelä, MD, PhD for his help and expertise during the design phase of the trial, Heini Huhtala, MSc (Tampere School of Public Health/Biostatistics) for her help in planning the statistical analyses, and research coordinator Pirjo Toivonen for her vital role in implementation. We would also like to acknowledge Kari Tikkinen, MD, PhD for his critical comments on the manuscript and Virginia Mattila, MA for linguistic expertise and language revisions.
Contributors TLNJ, RS, MP and AM designed the efficacy study. TLNJ and RS designed the ‘RCT within-a-cohort’ extension and wrote the manuscript. MP and AM provided substantive input on drafts of the manuscript. All authors contributed to the refinement of the study protocol and approved the final manuscript. TLNJ is the principal investigator of the study.
Funding This project is supported by the Sigrid Juselius Foundation, the Competitive Research Fund of Pirkanmaa Hospital District grant number: 9M121, the Social Insurance Institution of Finland grant number: 50/26/2012, and the Academy of Finland grant number: 259503. The funding sources had no role in the design of this study and will not have any role during its execution, analyses, interpretation of the data or decision to submit results.
Competing interests None.
Patient consent Obtained.
Ethics approval The institutional review board of the Pirkanmaa Hospital District (R06157).
Provenance and peer review Not commissioned; externally peer reviewed.