

Premarket evaluation of medical devices: a cross-sectional analysis of clinical studies submitted to a German ethics committee
  1. Stefan Sauerland,
  2. Naomi Fujita-Rohwerder,
  3. Yvonne Zens,
  4. Sandra Molnar
  1. Department of Non-Drug Interventions, Institute for Quality and Efficiency in Health Care (IQWiG), Cologne, Germany
  1. Correspondence to Dr Stefan Sauerland; stefan.sauerland{at}


Objective To assess the methodological quality of pre-market clinical studies performed on medical devices (MDs), including in-vitro diagnostic (IVD) MDs, in Europe.

Design Observational cross-sectional study.

Setting A large German ethics committee.

Materials From the consecutive sample of study applications between March 2010 and December 2013, we selected MD study applications requiring approval by an ethics committee and the competent federal authority. These included pre-market studies on devices that had not yet received a CE (Conformité Européenne) mark or had previously been CE marked for a different indication. Also included were post-CE studies requiring federal authority approval because the study entailed additional invasive or otherwise burdensome components.

Primary and secondary outcome measures Besides the design of the studies, we assessed the planned sample size, study duration and other aspects.

Results 122 study applications were analysed: 98 (80%) concerned therapeutic rather than diagnostic devices and 84 (69%) were pre-market studies. The proportion of studies on class I, IIa, IIb and III devices was 10%, 15%, 28% and 39%, respectively. 10 studies (8%) investigated IVD MDs. A randomised controlled trial (RCT) was planned in 70 (57%) of the 122 applications; studies with non-randomised control groups (n=23; 19%) or without controls (n=29; 24%) were less common. In the sub-group of pre-market studies on therapeutic devices, the proportion of RCTs was 66% (43/65). The median sample size was 120 participants or samples (IQR 53–229). The median study duration was 24 (14–38) months. 87 studies (71%) considered at least one patient-relevant outcome. 12 (17%) and 37 (53%) of the 70 RCTs applied a fully or partially blinded design, respectively.

Conclusion A large proportion of MD studies in Germany apply a randomised controlled design, thus contradicting the industry argument that RCTs on MDs are commonly infeasible.

  • medical device
  • device approval
  • clinical studies as topic
  • prostheses and implants
  • premarket approval

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • This is one of the very few analyses of clinical research on new medical devices (MDs) in Europe, because the MD approval system in the European Union is highly opaque.

  • We identified a large and consecutive sample of clinical studies, which allowed us to also analyse temporal trends in study quality.

  • Mainly because of confidentiality reasons, it was not possible to follow up study applications and link them with later publication of results or approval of MDs.

  • Due to the possible over-representation of multi-centre studies in our sample, the present results may be slightly too positive.

  • The study’s findings need to be updated in the future, as the Medical Device Regulation (MDR) came into force in May 2017 and may improve the quality of pre-market MD studies.


The regulatory framework governing the market access of new medical devices (MDs), including in-vitro diagnostic (IVD) MDs, in the European Union (EU) has turned out to be insufficient and outdated.1–5 In May 2017, new regulations on MDs/IVD MDs were introduced (MDR/IVDR) and will become legally binding in 2020.6 7 Although the necessity for this reform is mainly associated with the scandal of intentionally faulty breast implants,8 the current European regulatory framework for MDs/IVD MDs generally needs to be strengthened to increase uniformity and transparency. IVD MDs are a special form of MDs, but are not classified into the same risk classes as conventional MDs. In the following text, unless otherwise specified, the term ‘MD’ also covers IVD MD.

Under the currently valid MD directive,9 patients and clinicians have no legal right to know what clinical data are available on a new MD entering the European market.10 Manufacturers submit their clinical data to one of about 60 notified bodies authorised to issue a CE (Conformité Européenne) certificate, which allows the marketing of the device within the EU. However, the current directives are vague as to when and what type of clinical studies are required. Besides safety and performance, the manufacturer has to show that the use of the device has ‘acceptable risks when weighed against the benefits’.9 Previous studies have found large differences between study designs, with the proportion of randomised controlled trials (RCTs) ranging from less than 10%11 to about 50%,12 depending on device type, risk class and regulatory pathway.

However, major analyses of pre-market clinical studies on MDs have so far all been limited to the US market.11–20 In Europe, this field of clinical research can only be analysed under confidentiality rules, as study applications to regulatory bodies and ethics committees are classified as commercially confidential information in order to protect the intellectual property of manufacturers. We nevertheless sought to explore and describe the quality of clinical research conducted in Europe on MDs, especially in the pre-market phase.


Study sample

In this retrospective cross-sectional analysis conducted between October 2014 and June 2015, we screened all study applications submitted to the Berlin Ethics Committee between 21 March 2010 and 31 December 2013. The starting date was chosen because on 21 March 2010, a change in the European regulation (Directive 2007/47/EC) became binding through a revision of national German law. This revision introduced the concept of a benefit–risk ratio and strengthened the parallel consultation of study applications by ethics committees and federal authorities. By law, MD studies require this type of approval if one of the following conditions is fulfilled: (i) The study aims to assess an MD that has not yet been CE marked. (ii) The study aims to assess a CE-marked MD, but the MD is applied outside the intended purposes designated in the CE certificate. (iii) The study aims to assess a CE-marked MD within the designated purposes, but the study protocol includes additional invasive or otherwise burdensome components not usually included in the routine treatment of patients (eg, additional visits or imaging procedures). In the following text, the term ‘premarket MD studies’ refers to categories (i) and (ii) and the term ‘MD studies’ refers to all three categories.
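
The three legal conditions above amount to a simple decision rule. The following is a minimal sketch in Python, not the procedure actually used by the authors or the authorities; the function names and boolean parameters are hypothetical and were introduced only for illustration.

```python
from typing import Optional


def classify_application(ce_marked: bool,
                         within_intended_purpose: bool,
                         extra_burdensome_components: bool) -> Optional[str]:
    """Return the category (i)-(iii) described in the text, or None.

    All parameter names are hypothetical; they only encode the three
    conditions under which this type of approval is required.
    """
    if not ce_marked:
        return "i"    # device has not yet been CE marked
    if not within_intended_purpose:
        return "ii"   # CE-marked device used outside its designated purpose
    if extra_burdensome_components:
        return "iii"  # CE-marked, within purpose, but with additional burden
    return None       # none of the three conditions is fulfilled


def is_premarket(category: Optional[str]) -> bool:
    """'Premarket MD studies' are categories (i) and (ii)."""
    return category in ("i", "ii")
```

Under this sketch, a study of a CE-marked device used within its designated purpose but with additional imaging procedures would fall into category (iii) and would count as an MD study, but not as a premarket MD study.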

The Berlin Ethics Committee is legally responsible for all drug and MD studies conducted in the federal state of Berlin. University hospitals in Berlin also have ethics committees, but they are not allowed to approve drug or MD studies. The role of an ethics committee is either ‘responsible’ or ‘involved’, depending on whether it is responsible for the principal investigator who supervises the study or whether a different ethics committee is responsible. In Berlin, a total of about 81 hospitals serve a population of nearly 3.6 million. Given the mixture of hospitals (university-affiliated, teaching and other hospitals), the setting appeared suitable for a representative evaluation of clinical research on MDs.

Patient and public involvement

Due to the focus on regulatory science, neither patients nor the public were involved in the design or conduct of the present study.

Data extraction and analysis

After being instructed on confidentiality rules by the lawyer of the Berlin Ethics Committee, three experienced researchers (NF-R, YZ, SM) performed the data extraction; they were granted full access to the files of MD study applications at the Committee’s office. These files included the study protocol and all correspondence between applicants and the Committee. In those cases where the study protocol had been amended, data were extracted from the most recent version and reasons for amendments were also recorded. Germany has established a centralised information system where study sponsors have to enter all key information on a planned MD study. This information is then distributed to the competent ethics committee and the competent federal authority, which is usually the Federal Institute for Drugs and Medical Devices (BfArM). It was thus also possible to record those cases in which studies had not been approved by the competent federal authority.

Data were extracted on the type and risk class of MD, study aim and design, type of comparator, predefined primary and secondary outcomes, study duration, sample size, pre-specified data analysis and study registration in a public registry. If, after screening the information from the Berlin Ethics Committee, no registration was found for a study, we searched and the internet for this information. With regard to the aim of a study, if an application mentioned two or more aims (eg, safety and efficacy), the highest aim was recorded according to the following ranking: patient-relevant benefit, efficacy, safety and performance. A study outcome was judged as patient-relevant if it directly measured how a patient feels, functions or survives. Following pre-specified criteria,21 this included mortality, morbidity (symptoms, functional status, etc), adverse effects and health-related quality of life. If study participants did not suffer from a disease but were treated for other reasons (eg, aesthetic indications), patient-reported outcomes covering these reasons were also accepted as patient-relevant. Blinding was classified as ‘implemented’ if either study participants, treating staff or outcome assessors were unaware of the study interventions. The adequacy of allocation concealment was judged according to the Cochrane Collaboration guidelines. All information was extracted from the study application files by one person and checked independently by a second person. Discrepancies were resolved by consensus.
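
The two classification rules described above (recording the highest aim and judging blinding as ‘implemented’) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' extraction tool: the names `AIM_RANKING`, `highest_aim` and `blinding_implemented` are hypothetical, and the label "other" for aims outside the ranking is an assumption of this sketch.

```python
# Ranking taken from the text, highest aim first.
AIM_RANKING = ["patient-relevant benefit", "efficacy", "safety", "performance"]


def highest_aim(stated_aims):
    """Return the highest-ranked aim among those an application mentions.

    Aims that are not part of the ranking (e.g. feasibility) are reported
    as "other"; this label is an assumption of the sketch.
    """
    ranked = [aim for aim in AIM_RANKING if aim in stated_aims]
    return ranked[0] if ranked else "other"


def blinding_implemented(participants_blinded, staff_blinded, assessors_blinded):
    """Blinding counts as 'implemented' if at least one group is blinded."""
    return participants_blinded or staff_blinded or assessors_blinded
```

For example, an application that mentions both safety and efficacy would be recorded with the aim efficacy, and a study in which only the outcome assessors are unaware of the intervention would still count as blinded.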

Data were stored and analysed using Microsoft Excel 2010 and PASW Statistics V.18. Descriptive analyses were performed on an exploratory basis. Further analyses focused on the possible associations between type of MD and aspects of study design. All extracted data were kept in an anonymous format, thus preventing any inferences to specific device products, study sponsors or MD manufacturers. Due to this anonymity, after database closure it was not possible to follow the subsequent progress of studies until completion of patient recruitment or the publication of study results.


General characteristics of the studies included

In total, 122 study applications on 122 different MDs were analysed. No application was excluded. Table 1 summarises key characteristics of the 122 studies. At database closure, 65 studies were ongoing, 26 had been completed and 13 had been terminated early after inclusion of at least one participant. For a further nine studies, applications had been retracted, two studies had never been started after approval and three studies were not approved by the Berlin Ethics Committee (n=2) or the competent federal authority (n=1, approval revoked).

Table 1

Description of the 122 study applications

Most studies investigated therapeutic devices. There were 84 (69%) pre-market studies. Among these, 15, 24 and 27 studies assessed class IIa, IIb or III devices, respectively. The studies were to be conducted in a wide range of medical fields, mainly including cardiology (n=30, 25%) and vascular medicine (n=18, 15%). Pre-market, as opposed to post-market, studies were slightly less frequent in cardiology and vascular medicine (29 of 48, 60%) than in other medical fields (55 of 74, 74%).

Design aspects of the MD studies

In 70 (57%) of the 122 applications, the planned study was designed as an RCT. The remaining studies had either a non-randomised control group (n=23; 19%) or no controls (n=29; 24%). The proportion of RCTs steadily increased from 46% (in 2010) to 55%, 61% and 66% in 2011, 2012 and 2013, respectively. Only one of the 24 diagnostic studies was an RCT, in contrast to 70% (69 of 98) of studies on therapeutic MDs (tables 2 and 3). The study design was not associated with the risk class of a therapeutic MD. In the subgroup of pre-market studies, 66% of the studies were RCTs. Furthermore, 33 (47%) of the RCT applications contained an adequate description of how randomisation sequences had been generated. In 50 (71%) of the RCT applications, the method of allocation concealment was explained and was adequate.
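
The design proportions reported above can be recomputed from the counts given in the text. The snippet below is a small arithmetic check in Python; the variable and function names are hypothetical, and only counts stated in this section are used.

```python
# Counts of planned study designs as reported in the text.
design_counts = {"RCT": 70, "non-randomised control": 23, "no control": 29}
total = sum(design_counts.values())  # all analysed applications


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


shares = {design: percent(n, total) for design, n in design_counts.items()}

# RCT share by type of device (counts from the text).
rct_therapeutic = percent(69, 98)
rct_diagnostic = percent(1, 24)

print(total, shares, rct_therapeutic, rct_diagnostic)
```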

Table 2

Design aspects by type of intervention

Table 3

Study design by risk class of medical device

About half of all studies (62 of 122) aimed at showing efficacy. In the remaining applications, the highest aim of the study was either safety (n=22), performance (n=15) or patient-relevant benefits (n=1). A further 22 applications had other aims, such as feasibility or user satisfaction. Among the 62 studies on efficacy, 11 lacked a control group. Due to two three-arm studies, there were a total of 72 control groups. In addition to standard treatment, the control interventions consisted of another MD (n=34), no MD (n=11), a sham intervention (n=11), drug therapy (n=7), active surveillance (n=5), a different standard treatment (n=3) or a pharmaceutical placebo (n=1). All 12 studies using sham or placebo controls were RCTs.

Patient-relevant outcomes were to be assessed in 87 studies (71%), while 10 studies (8%) failed to examine any such outcome. In 25 mainly diagnostic studies, the aim of the study (eg, diagnostic accuracy) rendered the assessment of patient-relevant outcomes unnecessary. Of the 87 studies that examined patient-relevant outcomes, 63 were RCTs. No association between the assessment of patient-relevant outcomes and other study aspects such as risk class or CE-marking was apparent. The primary outcome was patient-relevant in 44 of 122 (36%) studies. No primary outcome at all was defined in eight studies, including two RCTs. In the sub-group of RCTs (n=70), 37 (53%) incorporated at least some form of blinding.

The planned sample sizes of the studies ranged from 5 to 2456 patients (or samples in diagnostics). Table 4 shows that studies on class III devices and IVD MDs tended to have a slightly larger sample size. The estimated study duration ranged from 3 to 85 months, with longer study duration being associated with a higher MD risk class. Likewise, the planned length of follow-up also increased with the MD risk class. Compared with single-centre studies, multi-centre ones were more likely to have an RCT design, were larger in sample size and longer in follow-up (table 5).

Table 4

Planned sample size, study duration and length of follow-up as a function of study design and risk class

Table 5

Design aspects by organisational size of study

Study registration and the influence of the ethics committee

More than 1 year after the study application had been submitted, 75 (61%) of studies were registered in a public study registry approved by the WHO. In the vast majority of cases (n=71), study applicants had chosen There were 19 unregistered RCTs (27%).

The Berlin Ethics Committee was primarily ‘responsible for’ rather than only ‘involved in’ 40 studies. Critical comments regarding the study aim or design were issued for 22 studies. The points of criticism included the sample size (n=12), aim and hypothesis (n=8), statistical analysis (n=8), measures against bias (n=5) and other reasons (n=5). Most comments were either resolved by the study applicants (n=15) or still ongoing at the time of database closure (n=2). Only four study applications were retracted or not approved (n=2 each).


According to the European Association of Medical Technology Industries, ‘most devices cannot be evaluated with randomised clinical trials as it is hard to blind and randomise devices due to strong ethical and practical issues’.22 In contrast, the main finding of the present analysis was the surprisingly high proportion of MD studies with a randomised controlled design. In various public discussions, all involved parties, that is, manufacturers, regulatory agencies, ethical committees and clinical researchers, had previously assumed that less than 10% or 20% of studies were randomised or comparative in design.23 This common belief was supported by the fact that several novel medical devices had received a CE mark, despite the fact that only small case series had been performed. Examples include the magnetic oesophageal sphincter,24 the leadless cardiac pacemaker25 or transcatheter aortic valve implantation.26 In view of the present analysis, these examples appear to be a vanishing regulatory pathway. Nevertheless, even the new European MDR fails to contain a clear recommendation on the study design. It only stipulates that the ‘level of clinical evidence shall be appropriate in view of the characteristics of the device and its intended purpose’.6

In the past, no systematic analysis of MD studies performed in Europe was available, due to various confidentiality issues surrounding premarket devices. The mere fact that the present analysis was completed is therefore a success on the road to increased transparency. Furthermore, our sample of MD studies appears to be representative of Germany and probably also of other European countries: According to personal communication with the competent federal authority (BfArM), 885 MD studies were applied for in Germany during the 4-year period covered by the present analysis. Thus, the Berlin Ethics Committee received about 14% of all German MD study applications. If 389 applications on regulatory exemption received by BfArM are excluded, this proportion rises to 25%. Inevitably, however, multi-centre studies are over-represented in a local or national sample of studies. An analysis of all MD studies in Europe or the world would result in a higher proportion of single-centre studies. As such studies are of lower quality, the current analysis somewhat overestimates the quality of clinical research on MDs.
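
The two coverage figures in this paragraph follow directly from the reported counts. A short Python check, using hypothetical variable names and assuming that the numerator is the 122 applications analysed in this study:

```python
berlin_applications = 122    # applications analysed in the present study
german_applications = 885    # all MD study applications in Germany (BfArM)
regulatory_exemptions = 389  # applications on regulatory exemption

# Share of all German applications received by the Berlin Ethics Committee.
share_all = berlin_applications / german_applications

# Share after excluding the applications on regulatory exemption.
share_excluding_exemptions = berlin_applications / (
    german_applications - regulatory_exemptions
)

print(f"{share_all:.0%}, {share_excluding_exemptions:.0%}")
```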

Furthermore, studies approved by an ethics committee in Europe are not necessarily comparable to all clinical studies that have been conducted on new MDs and used for CE marking. Manufacturers might want to conduct higher-quality studies in Europe but conduct lower-quality studies outside Europe. Such an imbalance in the field of clinical research is nevertheless unlikely, and it is more plausible to assume that MD manufacturers conduct most of their clinical studies in Western countries such as Germany. Still, the present study was unable to reveal how many new MDs are not evaluated in clinical studies before obtaining a CE mark.

Previous studies have shown that safety alerts and recalls of MDs are nearly three times more common in the USA than in the EU.5 27 When comparing the present results with similar studies conducted in the USA, it is important to keep the differences between the EU and the US regulations in mind. The US studies were also restricted to certain groups of high-risk devices. In general, the proportion of RCTs ranged between 27% and 45%,12 13 16 which is lower than the 57% found in the present study. Under both regulations, there is a clear time trend towards higher-level evidence. It is possible that manufacturers already anticipated the rising need for higher-level clinical evidence as now laid out in the new European MDR. However, stricter regulatory requirements cannot suffice as the only explanation for increased study quality, since this increase also covered class I and IIa devices, for which regulatory requirements remain lenient. The rising importance of health technology assessment (HTA) has probably led to an increased demand to prove that new devices offer specific benefits in terms of higher effectiveness rather than only safety and performance. In this sense, the regulatory field and the HTA field have moved closer together.28

It was also promising to find that 36% of all studies had a patient-relevant primary outcome. This figure is higher than the 12% rate reported by Dhruva et al in 2009.16 In 2015, an analysis of publicly registered MD studies found a rate of 30%.29 Still, further improvement is required, also because the new European MDR mandates that, for an evaluation of ‘clinical benefits, performance and safety … the primary endpoint shall be appropriate to the device and clinically relevant’.6 It remains to be seen whether all regulators in the EU will develop the same understanding of patient-relevant. The proportion of RCTs with blinding of participants, treating staff or outcome assessors was about 50%. This result compares well with pre-market drug studies, where blinding is present in about 80% of studies.30 Because of the physical effects of MDs, blinding is usually more difficult than in drug research. It is thus not surprising that among all studies with some form of blinding, only one-third (12 of 37) used a sham device, sham procedure or placebo in order to maintain blinding for study participants and treating staff. Still, blinding of outcome assessors is important and appears quite possible in the majority of studies.

Sample size and study duration are other important aspects where drug and device research differ. Pivotal pre-market studies on new drugs include a median of about 500 patients,30 31 which is clearly higher than the 120 patients found in the present analysis. On the other hand, only about a quarter of pre-approval drug trials follow up their participants for 6 months or longer.30 The median follow-up duration of 12 months in the present sample of MD studies thus compares very well with drug trials. However, this difference could also be explained by the fact that drugs, as opposed to MDs, are more often used to treat diseases with a short-term to medium-term course (eg, certain types of infections or cancers). The fact that 27% of RCTs were not registered in a public, WHO-approved study registry is disappointing, as non-registered studies have impaired scientific credibility. This finding shows that non-pharmacologic clinical research is still lagging behind, possibly because of insufficient methodological knowledge in smaller research units and companies. Future regulations should effectively prevent non-registration and non-publication of any clinical MD study.10 Although the MDR unfortunately fails to require prospective registration of MD trials, the European Database on Medical Devices (EUDAMED) will lead to more transparency. Going public in 2020, EUDAMED will contain information regarding MDs on the EU market and the clinical studies of these MDs. Importantly, clinical study results have to be published after a year, even if the MD failed to receive a CE mark. Future clinical research on MDs can thus be performed on the basis of publicly available data, which represents a major breakthrough when considering the confidentiality hurdles faced by the present analysis. The only exceptions are academic-led clinical studies, whose results are not primarily intended to support CE marking. As neither registration nor publication of these studies is defined at the European level, national law should ensure full transparency, also for this field of MD research.

In summary, the present data allow an optimistic outlook on clinical research on new MDs. The manufacturers’ argument that RCTs are often infeasible and do not represent the gold standard for MD research is clearly refuted. As high-quality evidence is increasingly common for pre-market studies, it is obviously worthwhile to secure these standards through the MDR in Europe and similar regulations in other countries.


The authors are grateful to Christian von Dewitz (Berlin Ethics Committee), Jürgen Windeler and Stefan Lange (both IQWiG) for their support, insightful guidance and suggestions during the course of this study. The authors also thank Natalie McGauran (IQWiG) for editorial support.




  • SS and NF-R contributed equally.

  • Contributors Conceptualisation: NF-R, SS. Investigation: NF-R, SM, YZ. Data curation and validation: NF-R, YZ. Formal analysis: NF-R, SS. Writing – original draft preparation: SS. Writing – review & editing: NF-R, SM, YZ.

  • Funding This work was funded by IQWiG, which itself is funded through levies on health care services in Germany. As requested by German law, IQWiG is a professionally independent, scientific institute. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement A more detailed analysis in German is available from the IQWiG website ( A spreadsheet recording the individual 122 studies is available on request from the authors’ institution (

  • Patient consent for publication Not required.
