Objectives To assess biostatistical quality of study protocols submitted to German medical ethics committees according to personal appraisal of their statistical members.
Design We conducted a web-based survey among biostatisticians who have been active as members in German medical ethics committees during the past 3 years.
Setting The study population was identified by a comprehensive web search on websites of German medical ethics committees.
Participants The final list comprised 86 eligible persons. In total, 57 (66%) completed the survey.
Questionnaire The first item checked whether the inclusion criterion was met. The last item assessed satisfaction with the survey. Four items aimed to characterise the medical ethics committee in terms of type and location, one item asked for the urgency of biostatistical training addressed to the medical investigators. The main 2×12 items reported an individual assessment of the quality of biostatistical aspects in the submitted study protocols, while distinguishing studies according to the German Medicines Act (AMG)/German Act on Medical Devices (MPG) and studies non-regulated by these laws.
Primary and secondary outcome measures The individual assessment of the quality of biostatistical aspects corresponds to the primary objective. Thus, participants were asked to complete the sentence ‘In x% of the submitted study protocols, the following problem occurs’, where 12 different statistical problems were formulated. All other items assess secondary endpoints.
Results For all biostatistical aspects, 45 of 49 (91.8%) participants judged the quality of AMG/MPG study protocols much better than that of ‘non-regulated’ studies. In median, the latter are affected by statistical problems 20–60 percentage points more often. The highest need for training was reported for sample size calculation, missing values and multiple comparison procedures.
Conclusions Biostatisticians active in German medical ethics committees classify the biostatistical quality of study protocols as low for ‘non-regulated’ studies, whereas quality is much better for AMG/MPG studies.
- medical ethics
- statistics & research methods
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the first survey among biostatisticians active in German medical ethics committees to collect their individual assessment of the quality and completeness of biostatistical aspects in the submitted study protocols.
Although much effort was put into searching for all biostatisticians active in German medical ethics committees, the target population was not completely identified.
Confidentiality issues did not allow direct and objective assessment of individual study protocols’ content.
This survey classified study protocols as regulated by the German Medicines Act/German Act on Medical Devices and those non-regulated by these laws, where the latter covers a very heterogeneous group of studies for which the statistical requirements are not all the same.
This survey was conducted too early to study the impact of recent revisions concerning the statistical concept of estimands.
Medical ethics committees (institutional review boards) aim to judge the quality and validity of medical studies in order to ensure an ethically justifiable positive benefit–risk profile.1–3 Their members do not only assess the submitted material but also act as consultants. Besides medical content, the board verifies legal and scientific validity including biostatistical aspects of the study design and analysis strategy.4–6 Although general guidance for a good biostatistical practice in medical research projects exists, there is no consensus and only limited guidance to what extent medical ethics committees should assess these statistical aspects.5–8 According to the last revision of the German Drug Regulation Law in 2016 (Bundesgesetzblatt, § 41a), a biostatistician is a mandatory member of a medical ethics committee, next to medical as well as legal experts, and lay persons.9 However, not all medical ethics committees appraise legally regulated studies in which case a biostatistician is not mandatory.10–13 Moreover, medical research is faced with the new challenges related to the digitalisation of the health system and the focus on personalised medicine, which also brings along new tasks and perspectives for medical ethics committees.14
Purpose or research question
To increase the biostatistical quality of study protocols, standards for biostatistical reporting and for biostatistical reviewer comments have to be implemented in Germany that account for the fact that the organisation and composition of German medical ethics committees are quite heterogeneous. Of course, in the long run, international standards have to be agreed on. To achieve this global aim, the first step is to assess the current level of statistical quality of submitted study proposals, so that gaps and challenges can be identified.
For this purpose, we conducted a comprehensive survey among biostatisticians who were active in German medical ethics committees between 2016 and 2018. The aim was to evaluate and quantify the personal assessment of the participants of the quality and completeness of statistical aspects in clinical study protocols submitted to German medical ethics committees.
A direct judgement of the statistical quality of study protocols would have required the assessment of relevant protocol extracts or even entire study protocols by the experts. This was, however, not possible due to enforced data protection mechanisms. Many medical ethics committees argued that original protocols (even partly and anonymised) could not be made available for the planned assessment without impairing the trust which is an essential part of the medical ethics committee’s standing.
To overcome this problem, we decided to ask biostatisticians in medical ethics committees to give a global, personal assessment on specific issues of the statistical quality and completeness of study protocols. On the one hand, the individual impression does not objectively reflect the ‘true’ quality. On the other hand, objective quality criteria are very hard to define and would definitively impose a need for a controversial discussion. Therefore, the individual global quality assessment of biostatisticians in medical ethics committees provides an informative marker to at least roughly assess current standards and problems. Biostatisticians in medical ethics committees review many study protocols and can well reflect the statistical problems currently met. From these findings, we can identify statistical topics which need an enforced focus, for example, within the framework of Good Clinical Practice courses, specific training addressed to the statistical reviewers to improve clarity of statistical reviewer comments and other training addressed to the medical investigators to improve their statistical knowledge.15–17
Qualitative approach and research paradigm
This study is a comprehensive systematic survey among biostatisticians who were members of German medical ethics committees between 2016 and 2018.
Researcher characteristics and reflexivity
The questionnaire was developed by a senior biostatistician, reviewed and extended by five independent biostatisticians including two professors of biostatistics, two senior biostatisticians and one bachelor student with a limited background in biostatistics. The latter person was consulted in particular to assess the comprehensibility of the wording.
All authors, who developed the survey, analysed the data and wrote this article, are members of the joint project group ‘Biometry in ethics committees’ of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS) e. V. and the German Region of the International Biometric Society (https://gmds.de/aktivitaeten/medizinische-biometrie/arbeitsgruppenseiten/projektgruppen/biometrie-in-der-ethikkommission/, accessed September 2019). This group was founded in 2017 and aims to strengthen the work of biostatisticians in medical ethics committees by offering specific training (in methods as well as communication of statistical issues to non-statisticians), establishing a communication network for mutual support and developing specific guidelines, which allow a standardised, high-quality statistical review of study protocols.
Data collection instruments and technologies
The questionnaire was implemented as an online survey (www.umfrageonline.com, accessed September 2019). The survey consisted of 31 items, which were grouped in 11 steps/pages in the online survey. Questions were formulated in German. The original survey can be found here (https://www.umfrageonline.com/s/6b2e8f4&preview=1&DO-NOT-SEND-THIS-LINK-ITS-ONLY-PREVIEW, accessed September 2019). English translations are provided in online supplementary appendix 1.
The first item checked the key inclusion criterion, namely whether the respondent had served as statistical expert in a medical ethics committee within the last 3 years. Only persons who answered this question positively were included in the final analysis. The last item evaluated whether the respondent enjoyed the survey. Of the remaining 29 main items, 4 items characterised specific features of the medical ethics committee and the review process within the medical ethics committee. We asked (1) for the type of the medical ethics committee (ethics committee of a medical faculty, of a State Chamber of Physicians (Landesärztekammer) or other), (2) for the federal state in Germany where the medical ethics committee is located, (3) how many studies the respondent reviews on average per year (in steps of 50) and (4) whether the respondent is exclusively responsible for study proposals according to the German Medicines Act (AMG)/German Act on Medical Devices (MPG) or also for studies that are not regulated by these laws, referred to in the following as ‘non-regulated studies’. In case the respondent’s medical ethics committee is responsible for regulated as well as for non-regulated studies, another (conditional) item asked whether the statistical quality of study protocols is better in the regulated compared with the non-regulated setting.
Additionally, 2×12 items asked for an assessment of the completeness and correctness of different biostatistical aspects (12 for the regulated setting, 12 for the non-regulated setting, conditional on the responsibilities of the specific medical ethics committee as marked in the previous item). Participants were asked to complete the sentence ‘In x% of the submitted study protocols, the following problem occurs’, where 12 statistical problems were addressed (eg, ‘specification of the significance level is missing’). Participants could provide percentages in steps of 10% (0%–100%) with a higher percentage indicating a worse result. In principle, items formulated in the way ‘In x% of the submitted study protocols, the following problem occurs’ could also have been assessed on an interval scale by allowing for continuous specifications of percentages. This would have suggested a quantitative and objective assessment. However, the subjective impression is neither truly quantitative nor completely objective, so the coarser 10% steps were used.
An additional item asked for the need to refresh the statistical knowledge of the medical or epidemiological investigator on a certain topic, to be selected from a list of nine statistical topics with the option to add further ones. In addition, this need had to be assessed as low, medium or high.
The online survey system saved the answers of the participants in a central database from which data can be downloaded in various formats. An extended group of experts including the authors validated the online survey by testing it and commenting on it. Final corrections were integrated after the validation phase.
Units of study
The study population is defined as all biostatisticians being members of a German medical ethics committee between 2016 and 2018.
To identify the study population, a web search on the homepages of all German medical ethics committees was performed which resulted in a preliminary email list. All these email addresses were freely available on the web. Moreover, several persons known to be active in ethics committees were asked to complete this list in agreement with the specific biostatistician. The final list of eligible candidates consisted of 86 biostatisticians.
Data collection methods
The call to participate in the survey was sent out by email on 28 November 2018 with a reminder on 13 December 2018. The survey was open until 15 December 2018. Some participants actively asked for the possibility to slightly extend the deadline, to which we agreed. The original survey was, however, open only between 1 December 2018 and 31 January 2019, so that after that time period data collection was completed.
Techniques to enhance trustworthiness
The webpage for the survey was not made public to avoid participation of persons not belonging to the study population. Still, in principle, anyone who was aware of the link could have participated in the survey. There is no way to check for fulfilment of the inclusion criterion or correctness of the provided answers. However, as the link was not easily available and the study population was likely to be highly compliant, this risk seems to be minor. The survey could be completed only once from a single IP address. In principle, participants could have completed the survey several times using different IP addresses, which seems very unlikely. As the survey was anonymous for data protection reasons, it is impossible to verify such fraud. However, as there was no benefit in completing the survey more than once, this risk seems to be minor. We did not advertise the survey by means of global mailing lists within biostatistical societies, although this approach was discussed. A limited and focused mailing list is preferable, as otherwise the response proportion, which is a crucial quality indicator of a survey, could not have been evaluated. Moreover, a global mailing list would have included a large proportion of recipients who did not fulfil the eligibility criteria.
Ethical issues pertaining to human subjects
The online survey was anonymised and most items included an option for providing no answer. The question asking for the medical ethics committee’s federal state allowed for a potential reidentification of the respondent in case of a small federal state like Bremen. Respondents were therefore free to leave this box blank. No personal data were collected from the participants. Participation in the survey was voluntary and not rewarded. To enable reproduction of the results presented in this paper, the final dataset is freely available from the Dryad repository (https://datadryad.org/stash/share/VjofDJkoUtqIjjaJQ84O7FB1W5cJu78g04cNey044no), without any information on the medical ethics committee’s federal state to avoid any risk of potential reidentification. No ethical approval was necessary for this voluntary survey in healthy participants, which carried no risk of harm to the respondents and had no direct medical research focus.
This is an exploratory study, which is analysed using descriptive statistical methods. Items 1–6 as well as items 30 and 31 are simple categorical items assessed on a multiple-choice basis. The items were evaluated by means of absolute and relative frequencies. The 2×12 items asking for the assessment of the completeness and correctness of different biostatistical aspects are Likert-scaled ordinal variables with 11 possible outcomes (0%, 10%, …, 100%). For these items, we reported absolute and relative frequencies and graphically displayed them as stacked bar charts. Moreover, we provided medians, quartiles and grouped boxplots, where two groups of studies are considered (regulated vs non-regulated studies). All analyses were performed using the statistical software R, V.3.5.1. The original dataset (excluding information on the specific German federal state) is freely available from the Dryad repository (https://datadryad.org/stash/share/VjofDJkoUtqIjjaJQ84O7FB1W5cJu78g04cNey044no) to allow for reproducibility.
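The descriptive summary outlined in this paragraph can be illustrated with a short sketch. The authors report using R; the Python code below is not their analysis script, and the function name `summarise_item`, the example responses and the quartile convention are assumptions made purely for illustration. It tabulates absolute and relative frequencies over the 11 answer categories and reports the median and quartiles, treating ‘not assessable’ or skipped answers as missing.

```python
from collections import Counter
from statistics import quantiles

# The 11 answer categories of each of the 2x12 items: 0%, 10%, ..., 100%.
SCALE = list(range(0, 101, 10))


def summarise_item(responses):
    """Summarise one item; None stands for 'not assessable' or no answer."""
    valid = [r for r in responses if r is not None]
    for r in valid:
        if r not in SCALE:
            raise ValueError(f"{r} is not one of the 11 answer categories")

    n = len(valid)
    counts = Counter(valid)
    # Absolute and relative frequency (in %) for every category, including empty ones.
    freq = {level: (counts[level], round(100 * counts[level] / n, 1)) for level in SCALE}

    # Assumption: the 'inclusive' interpolation rule is used for the quartiles;
    # other conventions can give slightly different values on an ordinal scale.
    q1, med, q3 = quantiles(valid, n=4, method="inclusive")
    return {"n": n, "freq": freq, "q1": q1, "median": med, "q3": q3}


# Hypothetical responses to a single item (not taken from the survey dataset).
example_responses = [10, 20, 20, 30, 50, None, 80, 20, 40]
summary = summarise_item(example_responses)
print(summary["n"], summary["q1"], summary["median"], summary["q3"])
```

With these made-up answers, eight of the nine responses are valid and the category 20% is chosen three times. The same function would be applied separately to the regulated and the non-regulated version of each item before plotting stacked bar charts or grouped boxplots.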
Patient and public involvement
This survey does not include patients or the general public. The design and the development of the survey were intensively discussed by the members of the joint project group ‘Biometry in ethics committees’ of the GMDS e. V. and the German Region of the International Biometric Society.
Table 1 shows the characteristics of the medical ethics committees to which the 57 participants of the survey are appointed. Note that the number of answers differs per item, as the online survey offered the option to abstain from answering a specific question. A majority of 46 participants (80.7%) were members of an ethics committee of a medical faculty of a university, whereas 15 (26.3%) were members of an ethics committee of a State Chamber of Physicians (Landesärztekammer). Some participants were members of more than one medical ethics committee at the same time.
A total of 47 participants answered the question on the location of the medical ethics committee to which they were appointed as member. The medical ethics committees are located in 12 out of 16 German federal states, where 14 (28.6%) participants were members of medical ethics committees in North Rhine-Westphalia and 7 (14.3%) in Baden-Württemberg.
A total of 18 (32.1%) participants reviewed up to 50 study proposals per year, 14 participants (25.0%) reviewed between 51 and 100 study proposals per year and 24 participants (42.9%) reviewed more than 100 study proposals on average per year.
The vast majority of 50 participants (89%) reviewed both study proposals according to AMG/MPG and study proposals in a non-regulated setting.
With respect to the general biostatistical quality of study protocols, table 2 displays the results of items 6 and 11. Only 47 participants answered item 6, which assessed whether the statistical quality of ethical proposals generally differs between regulated and non-regulated studies. As this item was placed in the survey before the specific biostatistical aspects were named, this general formulation seemed to be difficult to understand for a large part of the participants. Out of the 47 participants who responded to item 6, a majority of 45 (95.7%) stated that study protocols under regulatory requirements (AMG/MPG) are on average of higher statistical quality compared with studies without such requirements. The remaining 2 (4.3%) participants stated that there is no difference on average. Item 11 asked how the participants considered the need for additional training in different statistical areas addressed to the investigators submitting protocols (see table 2). A high need for training was especially identified for ‘handling of missing values’ as indicated by 34 participants (75.6%), for ‘sample size calculation’ as indicated by 27 participants (60.0%), for ‘multiple comparison procedures’ as indicated by 26 participants (57.8%) and for ‘adjustment for covariables’ as indicated by 26 participants (57.8%). For all topics, at least 70% of participants considered the need for a refreshment of statistical knowledge as medium or high.
Items 7–10 asked to assess completeness and correctness of biostatistical aspects while distinguishing studies according to the regulatory setting (AMG, MPG) (items 7 and 8) and studies without regulatory requirements (items 9 and 10). Participants were only able to judge those study types (regulated and/or non-regulated) that were specified in item 5. Participants were asked to complete the sentence ‘In x% of the submitted study protocols, the following problem occurs’, where 12 statistical problems were formulated (eg, ‘specification of the significance level is missing’). Participants could give the percentage in steps of 10% (0%–100%) with a higher percentage indicating a worse result. Note that study protocols submitted to German ethics committees can cover various types of research including observational studies, retrospective analyses and surveys. Not all of the 12 aspects formulated below are applicable to all types of studies. The requested percentages refer to the average of all studies that have been reviewed by the participant. However, there was also the option to classify a specific aspect as ‘not assessable’. As a consequence, the number of valid responses varies per item, where lower values might indicate items that are more difficult to judge. Moreover, some participants interrupted the survey after the assessment of some items, probably because the statistical problems are repeated for regulated and non-regulated studies, which might have decreased the motivation. The results referring to the valid responses are presented in table 3 and displayed in figure 1 as grouped boxplots. Additionally, figure 2 displays the percentages of the item categories for both study types as stacked bar plots.
It turns out that protocols of non-regulated studies tend to be of much lower statistical quality and show a lower level of completeness, regardless of the specific topic. Differences in medians of the ratings between regulated and non-regulated studies range between 20 and 60 percentage points. The statistical aspects ‘missing values’, ‘multiple comparison problems’ as well as ‘adjustment for covariables’ show the highest discrepancies between both study types. For instance, the statistical methods are not sufficiently specified in 80% on average (median) of non-regulated study proposals, whereas this is the case in only 20% on average (median) of studies with regulatory requirements. Similarly, for non-regulated studies only general statements on statistical analysis methods, which do not fit or address the specific study aim, are provided in 70% on average (median), whereas this is stated in only 10% on average (median) for regulated studies. For non-regulated studies, all 12 statistical aspects show high deficiencies, whereas for regulated studies such aspects are incompletely or incorrectly addressed in only 10% of proposals on average (median).
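How a difference in medians between the two study types is obtained for each statistical aspect can be sketched as follows. This is an illustrative example only: the ratings are hypothetical and are not taken from the survey dataset, and the names `median_difference` and `ratings` are assumptions introduced here.

```python
from statistics import median


def median_difference(regulated, non_regulated):
    """Median rating for non-regulated minus median rating for regulated studies.

    Ratings are percentages in steps of 10; None marks 'not assessable'.
    A positive result means the problem is reported more often for
    non-regulated studies (in percentage points).
    """
    reg = [r for r in regulated if r is not None]
    non = [r for r in non_regulated if r is not None]
    return median(non) - median(reg)


# Hypothetical ratings by five respondents for two of the 12 aspects.
ratings = {
    "statistical methods not sufficiently specified": {
        "regulated": [10, 20, 20, 30, None],
        "non_regulated": [70, 80, 80, 90, 60],
    },
    "specification of the significance level is missing": {
        "regulated": [0, 10, 10, 20, 10],
        "non_regulated": [30, 30, 40, 20, None],
    },
}

differences = {
    aspect: median_difference(groups["regulated"], groups["non_regulated"])
    for aspect, groups in ratings.items()
}

for aspect, diff in differences.items():
    print(f"{aspect}: {diff:.0f} percentage points")
```

Because each respondent rates both study types, a paired comparison would also be conceivable; the sketch compares the two group medians, which matches the descriptive way in which the results are reported here.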
This systematic survey among biostatisticians serving in German medical ethics committees aimed to assess the individual impression of completeness and correctness of biostatistical aspects of submitted study protocols. As an overall result, the completeness and correctness of handling statistical issues in the submitted study protocols is heterogeneous. There is a notable difference in quality between study protocols with and without regulatory requirements, where the latter show major deficits. A specifically high need for refreshment was identified for ‘handling of missing values’, ‘sample size calculation’, ‘multiple comparison procedures’ and ‘adjustment for covariables’. However, there also exist quite general deficiencies for non-regulated studies, as the description of the statistical methods is not sufficiently specified. It should be mentioned that for regulatory studies the International Conference on Harmonization (ICH) E9 guideline offers guidance on how to analyse a clinical study.6 This guideline is also helpful for the non-regulated settings but may be unknown to members of this community.13
To the best of our knowledge, this is the first survey among biostatisticians who were members of German medical ethics committees and the first attempt to assess the quality and completeness of biostatistical issues in medical study protocols. Wang et al conducted a survey among biostatistical consultants with respect to the quality of reporting the statistical analysis strategy after the data had already been analysed.18 The four most frequently reported statistical problems were ‘removing or altering some data records to better support the research hypothesis’, ‘interpreting the statistical findings on the basis of expectation, not actual results’, ‘not reporting the presence of key missing data that might bias the results’ and ‘ignoring violations of assumptions that would change results from positive to negative’. Clark et al screened original study protocols submitted to UK ethics committees for the completeness and correctness of sample size derivation.19 They found that only 42% of the study protocols reported all information that is required to accurately reproduce the sample size. Kilkenny et al conducted a survey of the quality of experimental design, statistical analysis and reporting of research in animal studies.20 They found that only 59% of the studies stated the hypothesis or objective of the study and the number and characteristics of the animals used, and only 70% of the publications described their methods and presented the results with a measure of error or variability.20 These findings are in line with the results of our survey.
In addition, Hall et al looked at the methodological quality of surgical clinical trials.21 They reported that less than 50% of the studies commented on potential bias in the assessment of the outcome, adequately described the randomisation technique, or commented on sample size calculation.21 Peng et al conducted a review of published epidemiological papers to assess the reproducibility of epidemiological research.22 They found that 30% of the publications did not report the implementation of the statistical analysis. Begley et al commented that there is a general problem of reproducibility of study results, in particular in preclinical studies.23 This is in line with the problems of study protocols and designs reported by Ioannidis et al.17
A total of 57 (66%) of the contacted 86 persons participated and fulfilled the inclusion criteria. This corresponds to a high participation proportion. However, it remains unknown whether all potential participants were truly identified and contacted.
A further limitation of our survey is that it does not provide an overview of the objectively measured quality and completeness of biostatistical aspects of study protocols but only refers to the subjective, individual impression of the statistical members of the ethics committees. An objective rating of the study protocols, however, was not possible due to data and privacy protection issues, as this would have required screening of the submitted study documents. Moreover, objective quality measures are difficult to define, because the meaning of ‘adequate’ quality might differ considerably. The subjective ratings of completeness and correctness are subject to interrater and intrarater variability. Therefore, the results should be interpreted as a rough indication rather than as definite numbers.
As a third limitation, we consider the fact that the survey could not assess recent issues added in an addendum to the ICH E9 guideline.24 It presents a structured framework to link trial objectives to a suitable trial design and tools for estimation and hypothesis testing. This framework introduces the concept of an estimand, translating the trial objective into a precise definition of the treatment effect that is to be estimated. It also aims to facilitate the dialogue between disciplines involved in a clinical trial.
Even in view of these limitations, the survey clearly indicates the need for basic and advanced statistical training and guidance for medical researchers. All medical faculties in Germany have established biostatistical units providing consulting services. However, this does not seem to be sufficient to enable medical researchers to develop protocols that cover statistical issues adequately. One reason could be that medical researchers undervalue the impact of appropriate biometrical planning on the quality and validity of medical studies. In personal discussions with participants of this survey, several of them reported frequent examples where the statistical analysis strategy in study protocols is addressed with a single general sentence like ‘The data are analysed with valid statistical methods.’ This indicates not only a lack of statistical knowledge but also a lack of awareness that statistical methods have an important impact on the validity of medical research.
The survey also gauges the range of methodological challenges encountered by biostatisticians who are members of a medical ethics committee. Often a biostatistician is the last methodological sentinel before a study is implemented in a clinical setting. In order to involve more biostatisticians in medical ethics committees, there is a need to provide support to enable them to adequately discharge their responsibilities. Unfortunately, the survey did not check how the remaining members react to revision requests by biostatisticians and whether these requests are adequately addressed before the final vote of the medical ethics committee on the criticised study protocol. Moreover, we did not formally assess whether biostatisticians who are members of medical ethics committees formulate their requests in comparable detail and with comparable persistence. Based on our own experience and on narrative reports, we suspect that biostatistical concerns cannot be easily communicated to and understood by the non-statistical members of the medical ethics committee. Thus, there is a need to establish better communication, which allows biostatistical concerns to be expressed in convincing, easily understandable language.
It is, therefore, time to communicate the general importance of statistics for medical research. This includes the establishment of guidelines for protocol writing and templates like the SPIRIT Statement, which also handles statistical input to a protocol.25 26 Furthermore, the implementation of reporting guidelines like STROBE should be made more popular.27 Moreover, the development of specific training and guidance on how to address specific statistical challenges is required. Finally, national standards for the tasks of a biostatistician as a member of a medical ethics committee must be formulated.28 29
Presented at This paper was written on behalf of a recently established joint working group ‘Biometry in ethics committees’ of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS) e. V. and the German Region of the International Biometric Society (https://gmds.de/aktivitaeten/medizinische-biometrie/arbeitsgruppenseiten/projektgruppen/biometrie-in-der-ethikkommission/, accessed September 2019).
Contributors GR developed the research idea, designed the survey and wrote the manuscript. LH implemented the online survey, performed the analyses and reviewed the manuscript. UM helped to develop the research idea, reviewed the survey, reviewed the manuscript. IP helped to develop the research idea, reviewed the survey, reviewed the manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. The original dataset of the survey is available from the Dryad repository, https://datadryad.org/stash/share/XbZQ8pXEbP8PXuuAgSPm4sc2jt_HGWRSQRV-PoGcNqs.