Original research
Medical researchers’ perceptions regarding research evaluation: a web-based survey in Japan
  1. Akira Minoura1,
  2. Yuhei Shimada2,3,
  3. Keisuke Kuwahara4,5,6,
  4. Makoto Kondo7,
  5. Hiroko Fukushima8,9,
  6. Takehiro Sugiyama3,10
  1. 1 Department of Hygiene, Public Health and Preventive Medicine, Showa University School of Medicine, Shinagawa-ku, Japan
  2. 2 Department of Law and Politics, The University of Tokyo, Bunkyo-ku, Japan
  3. 3 Diabetes and Metabolism Information Center, Research Institute, National Center for Global Health and Medicine, Shinjuku-ku, Japan
  4. 4 Department of Epidemiology and Prevention, Center for Clinical Sciences, National Center for Global Health and Medicine, Shinjuku-ku, Japan
  5. 5 Department of Public Health, Yokohama City University School of Medicine, Yokohama, Japan
  6. 6 Department of Health Data Science, Graduate School of Data Science, Yokohama City University, Yokohama, Japan
  7. 7 Department of Anatomy and Neuroscience, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
  8. 8 Department of Pediatrics, University of Tsukuba Hospital, Tsukuba, Japan
  9. 9 Department of Child Health, Institute of Medicine, University of Tsukuba, Tsukuba, Japan
  10. 10 Department of Health Services Research, Institute of Medicine, University of Tsukuba, Tsukuba, Japan
  1. Correspondence to Dr Takehiro Sugiyama; tsugiyama{at}hosp.ncgm.go.jp

Abstract

Objectives Japanese medical academia continues to depend on quantitative indicators, contrary to the general trend in research evaluation. To understand this situation better and facilitate discussion, this study aimed to examine how Japanese medical researchers perceive quantitative indicators and qualitative factors of research evaluation and their differences by the researchers’ characteristics.

Design We employed a web-based cross-sectional survey and distributed the self-administered questionnaire to academic society members via the Japanese Association of Medical Sciences.

Participants We received 3139 valid responses representing Japanese medical researchers in any medical research field (basic, clinical and social medicine).

Outcomes The subjective importance of quantitative indicators and qualitative factors in evaluating researchers (eg, the journal impact factor (IF) or the originality of the research topic) was assessed on a four-point scale, with 1 indicating ‘especially important’ and 4 indicating ‘not important’. The attitude towards various opinions in quantitative and qualitative research evaluation (eg, the possibility of research misconduct or susceptibility to unconscious bias) was also evaluated on a four-point scale, ranging from 1, ‘strongly agree’, to 4, ‘completely disagree’.

Results Notably, 67.4% of the medical researchers, particularly men, younger and basic medicine researchers, responded that the journal IF was important in researcher evaluation. Most researchers (88.8%) agreed that some important studies do not get properly evaluated in research evaluation using quantitative indicators. The respondents perceived quantitative indicators as possibly leading to misconduct, especially in basic medicine (strongly agree—basic, 22.7%; clinical, 11.7%; and social, 16.1%). According to the research fields, researchers consider different qualitative factors, such as the originality of the research topic (especially important—basic, 46.2%; social, 39.1%; and clinical, 32.0%) and the contribution to solving clinical and social problems (especially important—basic, 30.4%; clinical, 41.0%; and social, 52.0%), as important. Older researchers tended to believe that qualitative research evaluation was unaffected by unconscious bias.

Conclusion Despite recommendations from the Declaration on Research Assessment and the Leiden Manifesto to de-emphasise quantitative indicators, this study found that Japanese medical researchers have actually tended to prioritise the journal IF and other quantitative indicators based on English-language publications in their research evaluation. Therefore, constantly reviewing the research evaluation methods while respecting the viewpoints of researchers from different research fields, generations and genders is crucial.

  • MEDICAL EDUCATION & TRAINING
  • Health policy
  • GENERAL MEDICINE (see Internal Medicine)

Data availability statement

Data are available upon reasonable request. Data used in the analysis will be made available to researchers upon request, in compliance with ethical guidelines and with the ethics committee’s approval.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

STRENGTHS AND LIMITATIONS OF THIS STUDY

  • A web-based survey of Japanese medical researchers across research fields was conducted through various medical societies and the Japanese Association of Medical Sciences.

  • The questionnaire was developed through focus group interviews with 22 medical researchers from various backgrounds.

  • The subjective importance of quantitative indicators and qualitative factors in evaluating researchers, stratified by the respondents’ characteristics, was demonstrated.

  • The number of responses was limited when compared with the total number of medical researchers in Japan.

  • The design of a web-based self-administered survey could possibly result in bias.

Introduction

Evaluating research is essential for continuous scientific progress nationally and internationally.1 Although there is no universal definition, research evaluation refers to the assessment of all research project processes, from the planning of a research project to the dissemination of its results and the development of subsequent research areas.2 Regardless of whether it is quantitative or qualitative, research evaluation assesses performance in relation to the research missions or objectives.3 Researcher evaluation, by contrast, assesses the researchers themselves; depending on the evaluation objective, it may cover their cumulative research activities as well as non-research activities such as education, professional practice and administration.4

However, some quantitative metrics of scientific outputs, such as the number of English-language publications and the number of citations, are considered important in the allocation of funds and the recruitment of researchers at universities.5 In particular, the journal impact factor (IF), which was originally a measure for journals, not for individual papers, has occasionally been used to evaluate the quality of an article or the productivity of a researcher. To counter this trend, actions have been taken to promote responsible research assessment among researchers worldwide, symbolised by numerous researchers and organisations signing the Declaration on Research Assessment (DORA), which mainly opposes the use of the IF for research evaluation.6 Furthermore, the Leiden Manifesto for Research Metrics warns against the pervasive misapplication not only of the IF but also of quantitative indicators in general to the evaluation of scientific performance.3 Recently, the Science Council of Japan issued a recommendation regarding research evaluation, stating that quantitative assessment methods should not be overemphasised in research evaluation; they hoped to introduce international trends to help Japanese researchers develop appropriate ways to conduct research evaluations.7

However, the current state of the research evaluation has not yet achieved the stated goal. Indeed, the fourth Medium-Term Plans of National University Corporations, which are required by law to establish Key Performance Indicators to achieve the ministry’s Medium-Term Goals, state that they will measure their performance with a focus on quantitative indicators, including the number of published articles.8 This is true in the field of medicine; combined with the fierce competition for positions as medical researchers, publishing in journals with high IFs is encouraged regardless of differences in fields.9 Consequently, this merit-based evaluation, combined with an overcompetitive environment, puts pressure on researchers to publish, potentially making them more susceptible to research misconduct.10–12

To address this contentious situation and find a solution, it is important to understand how medical researchers internalise the evaluation axes of their research/researcher and how they interpret the evaluations they receive. Internationally, in addition to studies aimed at the entire research community,13–15 some studies have investigated medical researchers’ perceptions.16–19 Similarly, in Japan, researchers’ perspectives on the evaluation system have been discussed.20 21 The problems with current evaluation practices have also been highlighted among domestic medical researchers.22 23 To our knowledge, however, no previous research has measured medical researchers’ perceptions of the evaluation of the research/researcher using a large-scale questionnaire.

This study aims to clarify the perceptions of Japanese researchers in medicine regarding research evaluation and extract the problems they face. We conducted a questionnaire survey among medical researchers in the fields of basic, clinical and social medicine to examine the characteristics and issues in the current evaluation axis of medical researchers and identify the evaluation methods that can be considered in the future. Specifically, the study identifies the current state of how medical research and researchers should be evaluated.

Methods

Development of the questionnaire

Figure 1 presents an overview of the survey, and the Method Detail in the online supplemental material describes the details. A team of volunteer junior faculty members of the Scientific Committee for the 31st General Assembly of the Japanese Association of Medical Sciences worked on this study. Because no established measures exist on this topic, we developed a preliminary questionnaire and refined it through focus group interviews (FGIs) with researchers affiliated with member societies of the Japanese Association of Medical Sciences (eg, the Japanese Society of Internal Medicine, the Japan Surgical Society, the Japanese Association of Anatomists and the Japanese Society of Public Health), (non-medical) experts in research evaluation, senior researchers, early career researchers and students.24 FGIs allowed us to extract opinions on research evaluation across disciplines and careers in the Japanese medical field.

Figure 1

Overview of the survey. This study was conducted in two phases: a survey to improve and finalise the questionnaire (based on focus group interviews and receiving comments) and the implementation of our web-based survey. The details are described in the Method Detail in the online supplemental material.

Survey design

After each interview, we reviewed and revised the questionnaire based on the participants’ opinions. Using the revised questionnaire, we conducted a web-based survey of medical researchers. The Japanese Association of Medical Sciences asked medical academic societies to announce and distribute the survey to their members. However, the survey announcement was voluntary; the method of announcement varied between societies (eg, email newsletter and notification on the society’s website), and some societies may not have sent the announcement to their members. The organisation represents the entire Japanese medical research community, ensuring the broadest possible reach to the medical researchers who are the focus of our research. In Japan, in anticipation of the increasing sophistication of medical care and the decrease in the number of medical personnel due to the declining birthrate, the way doctors work will undergo major legal changes in 2024, and medical researchers are becoming increasingly interested in research evaluation. On the survey website, after the present survey was explained, those who did not consent to the study or those whose daily work (ie, work before going on maternity leave and childcare leave) was not related to research were asked to leave the website and were therefore excluded. The survey period was from December 14, 2022, to January 17, 2023.

Statistical analyses

The survey involved a self-administered questionnaire to obtain responses on the evaluation’s current status and issues regarding how medical research and researchers were evaluated. In addition to the descriptive statistics, to reveal the difference in local situations, cross-tabulation was calculated stratified by various factors such as age, gender, position and family situation. Other variables indicate the characteristics of the respondents. To efficiently analyse the results, we summarised the characteristics of respondents into fewer classifications. Questionnaires with the same answers to all questions were considered invalid and were excluded. To confirm the robustness of the results of cross-tabulation, we additionally examined the adjusted values. Ordered logistic regression was used to adjust for factors of gender, research field and age and to calculate the predicted percentage of each answer. We excluded the ‘I do not know’ responses in the adjusted analyses. The methods used have been detailed in the online supplemental material. Descriptive analyses were conducted using QuickCross (Macromill, Inc., Minato-ku, Tokyo, Japan), and statistical analyses were conducted using Stata 17.0 (Stata Corp., TX, USA).
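The stratified cross-tabulation described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (they used QuickCross and Stata); the function name `crosstab_percent`, the field names `field` and `if_importance`, and the toy responses are all hypothetical and chosen only to show how row percentages per stratum are obtained from four-point answers.

```python
from collections import Counter


def crosstab_percent(rows, stratum_key, answer_key):
    """Return {stratum: {answer: percent}} using row percentages (1 decimal)."""
    counts = {}
    for row in rows:
        # One Counter of answers per stratum (e.g. per research field).
        counts.setdefault(row[stratum_key], Counter())[row[answer_key]] += 1
    table = {}
    for stratum, counter in counts.items():
        total = sum(counter.values())
        table[stratum] = {
            answer: round(100 * n / total, 1) for answer, n in counter.items()
        }
    return table


# Hypothetical toy data: 1 = 'especially important' ... 4 = 'not important'.
responses = [
    {"field": "basic", "if_importance": 1},
    {"field": "basic", "if_importance": 2},
    {"field": "clinical", "if_importance": 2},
    {"field": "clinical", "if_importance": 3},
    {"field": "clinical", "if_importance": 2},
    {"field": "social", "if_importance": 4},
]

table = crosstab_percent(responses, "field", "if_importance")
for stratum, shares in table.items():
    print(stratum, shares)
```

Percentages are computed within each stratum, which is the form in which the stratified results are reported (for example, the share answering 'especially important' within basic, clinical and social medicine).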

Ethical considerations

The study protocol was approved by the Institutional Review Board of the National Centre for Global Health and Medicine (NCGM-S-004530-01).

Patient and public involvement

None.

Results

A total of 3169 researchers answered the questionnaire during the survey period; 386 respondents either did not consent to participate or declined because research activity was not a part of their job. Among the responses, 30 were excluded because they had invalid answers; thus, the analysis included 3139 researchers (2244 men, 852 women and 43 others). The response rate could not be calculated because the number of potential respondents (ie, medical researchers who received the survey announcement) was unknown. Table 1 shows the characteristics of the participants, whereas online supplemental table S1 presents more comprehensive descriptive analyses of the survey answers. By academic rank, professor level (eg, professors, directors of clinical departments or directors of research laboratories) was predominant (n=1213, 38.6%), and by employment status, full-time (tenured) employees were the most common (n=2009, 64.0%). Regarding effort for research, 33.3% (n=1048) of researchers answered that research work accounted for half or more of their work time.

Table 1

Characteristics of survey participants

For quantitative indicators in evaluating researchers (figure 2 for stratified results, online supplemental table S1 Q3-1 for overall results and online supplemental table S3 Q3-1-1, Q3-1-2 and Q3-1-4 for details of stratified results), 67.4% answered that the journal IF was important (especially important, n=616, 19.6%; important, n=1501, 47.8%), notably more in basic medicine than in clinical and social medicine, among younger researchers than older researchers and among men than women. Compared with respondents with no medical license, physicians and other healthcare professionals were more likely to respond that IFs are important. The number of papers published in English-language journals was considered more important (especially important, n=1045, 33.3%; important, n=1625, 51.8%) than the number published in Japanese-language journals (especially important, n=106, 3.4%; important, n=947, 30.2%). The preference for English-language journals over Japanese-language journals was more pronounced in basic medicine than in clinical and social medicine, in younger researchers (39 years old or younger) than older researchers (60 years old or older) and in men than in women. Online supplemental table S4 shows the results of cross-tabulations stratified by license and education to observe how the effect of education differed by the medical profession. Physicians and dentists with only an MD or doctor of dental surgery and those with a PhD placed more importance on the IF in research evaluation (online supplemental table S4 Q3-1-4). Among respondents with other medical licenses, quantitative indicators were valued most highly by those with a doctoral degree, followed by those with a master’s degree and those with an undergraduate degree.

Figure 2

Quantitative indicators in evaluating the surrounding researchers. Each colour represents the percentage of answers. The left, middle and right columns are the cross-tabulation results by research field, age and gender, respectively. The exact values are shown in online supplemental table S3 (Q3-1-1, Q3-1-2 and Q3-1-4 for tables stratified by research field, age and gender). Note: the terms ‘clinical’, ‘basic’ and ‘social’ refer to clinical, basic and social medicine. ‘yo’ means ‘years old’.

Regarding the qualitative factors in evaluating researchers (figure 3 for stratified results, online supplemental table S1 Q3-2 for overall results and online supplemental table S3 Q3-2-2, Q3-2-3 and Q3-2-5 for details of stratified results), the originality of the research topic (especially important, n=1159, 37.0%; important, n=1614, 51.5%) and contribution to the advancement of science (especially important, n=1172, 37.4%; important, n=1485, 47.4%) were considered more important than the exhaustiveness of analyses (ie, the degree to which necessary analyses are thoroughly performed) (especially important, n=392, 12.6%; important, n=1612, 51.6%). The originality of the research topic was considered more important in basic medicine than in clinical and social medicine (especially important: basic, n=321, 46.2%; social, n=223, 39.1%; and clinical, n=541, 32.0%), whereas the contribution to solving clinical and social problems was considered more important in social medicine than in basic and clinical medicine (especially important: basic, n=211, 30.4%; clinical, n=692, 41.0%; and social, n=295, 52.0%).

Figure 3

Qualitative factors in evaluating the surrounding researchers. Each colour represents the percentage of answers. The left, middle and right columns are the cross-tabulation results by research field, age and gender, respectively. The exact values are shown in online supplemental table S3 (Q3-2-2, Q3-2-3 and Q3-2-5 for tables stratified by research field, age and gender). Note: the terms ‘clinical’, ‘basic’ and ‘social’ refer to clinical, basic and social medicine. ‘yo’ means ‘years old’.

Figure 4 illustrates the researchers’ perceptions of quantitative indicators and qualitative factors for research evaluation. Most researchers (88.8%) agreed that some important studies do not get properly evaluated in research evaluation using quantitative indicators, especially in basic and social medicine, among the 40–49 age group and men. Researchers in basic medicine more often perceived that the use of quantitative indicators could lead to misconduct than those in clinical and social medicine (strongly agree: basic, 22.7%; clinical, 11.7%; and social, 16.1%). Older researchers were more likely than younger researchers to consider that qualitative research evaluation was not affected by unconscious bias. Furthermore, online supplemental table S3 stratified by academic rank (Q3-4-8, Q3-5-2) revealed that respondents at the professor level were less likely to believe that focusing on quantitative indicators would lead to underestimation of non-research activities (eg, education, clinical practice and social activities) and tended to be unaware of the susceptibility of qualitative evaluation to biases caused by interpersonal relationships or unconscious biases.

Figure 4

Researchers’ perceptions of quantitative indicators and qualitative factors for research evaluation. Each colour represents the percentage of answers. The left, middle and right columns are the cross-tabulation results by research field, age and gender, respectively. The exact values are shown in online supplemental table S3 (Q3-4-5, Q3-4-7 and Q3-5-2 for tables stratified by research field, age and gender). Note: the terms ‘clinical’, ‘basic’ and ‘social’ refer to clinical, basic and social medicine. ‘yo’ means ‘years old’.

Online supplemental table S5 shows the proportion for each choice, adjusted for gender, research field and age category, using ordered logistic regression analysis. These tables suggest that the main results shown in figures 2–4 were not explained solely by confounding.
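As a reminder of what such an adjustment involves, a standard proportional-odds formulation of ordered logistic regression is sketched below. The notation is ours and is an assumption: $Y$ is the four-point answer, $x$ is the covariate vector (gender, research field and age category), $\beta$ is the coefficient vector and $\kappa_1 < \kappa_2 < \kappa_3$ are the cutpoints, following the common parameterisation. The exact specification used in this study is given in the online supplemental material.

```latex
\begin{align}
  P(Y \le j \mid x) &= \frac{1}{1 + \exp\!\bigl(-(\kappa_j - x^{\top}\beta)\bigr)},
    \qquad j = 1, 2, 3, \\
  P(Y = j \mid x) &= P(Y \le j \mid x) - P(Y \le j - 1 \mid x),
    \qquad P(Y \le 0 \mid x) = 0,\; P(Y \le 4 \mid x) = 1 .
\end{align}
```

Under this formulation, a predicted percentage for each answer is obtained by evaluating $P(Y = j \mid x)$ at chosen covariate values or averaging it over respondents; which of these was used here is not stated in the main text.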

Regarding DORA recognition, only 10.1% of the respondents knew its contents, whereas 28.8% knew the name but not the contents (online supplemental table S1 Q4-3). In other words, 61.1% were unfamiliar with the name DORA. Given that DORA recognition represents evaluation knowledge, this variable’s stratified results can be interpreted as the effect of evaluation literacy. Researchers who recognised the DORA tended to place slightly less emphasis on the importance of the IF (online supplemental table S3 stratified by research evaluation literacy Q3-1-4). Among them, those who knew its contents were likely to value the qualitative factors such as the originality of the topic or methodology (Q3-2-1 and Q3-2-2), contribution to the advancement of science (Q3-2-3) and contribution to clinical and social problem-solving (Q3-2-4). Furthermore, they also agreed that some important studies do not get properly evaluated by quantitative indicators (Q3-4-5), quantitative indicators may lead to research misconduct (Q3-4-7), and the validity of the qualitative evaluator’s (ie, reviewer’s) assessments should be evaluated (Q3-5-4).

We obtained a total of 645 responses to the open-ended question and classified them into seven categories. Of the 232 responses concerning the survey itself, 37 recommended public reporting of results, 34 suggested incorporating results into policy, 77 highlighted survey problems and criticisms, and 91 expressed positive attitudes towards the survey. The responses that mentioned work activities other than research (n=84) included education, clinical practice, social activities, administrative work and peer review. We also received responses regarding institutional-environmental conditions for research evaluation (n=77), which identified problems such as differences in fields, evaluators’ abilities and the amount of available financial and human resources. Regarding the nature of the indicators (n=69), opinions were divided into two groups: 32 criticised and 26 supported the quantitative indicators. Online supplemental table S6 contains additional categories (activities outside of work (n=11), structural conflicts between valuable research and evaluation (n=11) and evaluation fatigue (n=7)), subcategories and examples.

Discussion

Summary findings

The web-based survey yielded two major findings. First, the majority of medical researchers in Japan, particularly those in basic medicine, young researchers and men, believe that the IF and other quantitative indicators based on English-language publications are appropriate for assessing researchers. Second, medical researchers’ perceptions of quantitative and qualitative indicators in evaluating medical research and researchers varied depending on the participants’ characteristics, such as research field, age and gender. When evaluating researchers, basic medicine researchers were more likely to consider the number of articles published in English-language journals, the journal’s IF and the originality of the research topic. Meanwhile, more social medicine researchers than other medical researchers believed that the number of articles published in Japanese-language journals and the contribution to the resolution of clinical and social problems were important. To our knowledge, this is the first study to clarify perceptions of research/researcher evaluation among medical researchers in Japan.

Reliance on quantitative indicators derived from English-language publications

The general tendency to emphasise the IF across disciplines deviates significantly from the DORA, which recommends against using IFs for research evaluation. Not only the IF but also other quantitative indicators based on English-language publications were regarded as significant factors for evaluation, which is the situation the Leiden Manifesto is concerned with. This result was widely observed among respondents, as 67.4% placed importance on the IF and 85.1% placed importance on the number of papers published in English-language journals (online supplemental table S3 Q3-1). It demonstrates that metrics play an important role in the evaluation system among the Japanese medical research communities as a whole. To advance current evaluation practices, we must approach the entire medical community rather than a specific group. Although many researchers or research institutes in Japan use the IF as a metric for study importance or for a researcher’s productivity (eg, by adding the IFs of the journals in which they published papers), it should be noted that the IF was originally designed to measure the influence of a scientific journal, rather than the quality of the research or the researchers.5

In addition to the general over-reliance on these metrics, there are attribute-specific trends in the preference for quantitative measures. Younger researchers were likely to refer to the IF and other quantitative indicators based on English-language publications, perhaps because many hold fiercely competitive positions; they tended to internalise the widely used evaluation metrics. This tendency to place importance on the evaluation axes in the form of published papers is possibly compatible with the study’s results; among the qualitative factors for evaluating researchers, the exhaustiveness of analyses was rated lower than the originality of the research topic and contribution to the advancement of science. This may be due in part to the requirement to present through the medium of a paper that demands conciseness rather than exhaustiveness.25 The importance of the IF in research evaluation was not significantly affected by knowledge of DORA. However, knowledge of DORA was linked to higher ratings of qualitative factors such as research topic originality. As only 10% of participants claimed to be familiar with DORA and its contents, advocating for and supporting these activities and statements may influence perceptions of research evaluation.

Variety of research evaluation axis by attributes

Remarkably, the evaluation axes differed among research fields, age groups, genders and other subcategories. Researchers of basic medicine tend to rate IFs and the number of papers published in English-language journals higher and the number of papers in Japanese-language journals lower. This may be because the research product in basic medicine is often applicable in any country; thus, it is reasonable to publish information in English. In addition, basic medicine is more susceptible to funding shortages because of the maintenance costs of laboratory equipment, so researchers in this field may need to generate well-evaluated outputs. However, this may not be the case in clinical and social medicine; the main readers of their research products may also be clinicians or policymakers who may not be well versed in English.26 Clinical and social medicine, in contrast to basic medicine, sometimes focus on the domestic context, which is separate from international journals.23 Furthermore, researchers in clinical or social medicine are expected to engage in a variety of tasks in addition to writing papers in English (eg, clinical practice, guideline development and social practice). Therefore, it is not easy to establish a universal evaluation axis across research fields.

Furthermore, the difference in perception by age and academic rank may partly represent the contrast between evaluators and those who are evaluated. For example, while older and professor-level researchers placed less importance on quantitative indicators, they tended to be unaware of the risk of unconscious biases derived from qualitative evaluation. Although senior researchers frequently evaluate junior researchers and therefore have the authority to determine evaluation axes, discussions about research evaluation between age groups can foster mutual understanding and enhance young researchers’ capacity and responsibility to take on future research fields.27 28

Mild differences were also observed by gender, such as women placing less emphasis on the IF than men, which persisted after adjustment for covariates; thus, gender diversity should be considered when discussing research evaluation. Meanwhile, the low number of women in management positions29 and difficulties in maintaining work-life balance, which is expected to improve in the near future, may have contributed to reducing the observed differences based on gender in these results. Regarding profession and education, respondents with a medical license generally tended to respond that IFs are important; interestingly, the effect of graduate education was heterogeneous between physicians/dentists and other medical professionals.

Implication of the study for further consensus

This study did not set out to find a new indicator for research evaluation. Although we found that the situation in Japan differs from what the DORA and the Leiden Manifesto aim for, many researchers may become unsure which evaluation axis to use and require alternative research evaluation axes to rely on.

One possible solution is to develop more reliable evaluation criteria. In fact, ‘field weighted citation impact (FWCI)’ and ‘top 10% of highly cited papers’ compensate for differences in disciplines better than the (unadjusted) journal IF.30 31 This may help alleviate differences between disciplines, such as basic, clinical and social medicine. The H-index takes into account both the number and impact of articles written by a researcher.32 Additionally, advanced metrics known as altmetrics, which measure social impact, are being developed. In the age of open-access journals and social networking services, efforts to establish metrics for medical research evaluation should continue. However, it is difficult to develop a definitive metric. For example, the FWCI or top 10% of highly cited papers cannot fully account for the differences between research fields; citations may not be the best indicator of impact in some fields. The H-index, which is influenced by the researcher’s academic age and research field, should be used with caution. Indeed, its excessive use was questioned by scientometricians and resulted in the publication of the Leiden Manifesto.3 33 Moreover, the development of a definitive metric does not imply that we can stop thinking about it, because once a metric is established, it becomes an end in itself, undermining efforts towards overall optimisation.
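For readers unfamiliar with the H-index mentioned above, it is commonly defined as the largest number h such that a researcher has h papers with at least h citations each. The following minimal Python sketch computes it from a list of per-paper citation counts; the function name `h_index` and the example citation lists are hypothetical and serve only as illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Two hypothetical researchers with five papers each.
steady = [10, 8, 5, 4, 3]      # citations spread across papers
one_hit = [250, 8, 5, 3, 3]    # one very highly cited paper

print(h_index(steady), h_index(one_hit))
```

The second example shows why the index depends on the number of papers as well as their impact: a single very highly cited paper does not raise the value, and a researcher with a short publication record cannot reach a high value, which is consistent with the caution about academic age noted above.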

Rather than looking for better metrics, we must accept the limitations of quantitative indicators and share the understanding that quantitative indicators should only be used in conjunction with qualitative evaluation. Even though it is difficult to evaluate research/researchers in a solely qualitative manner as academic disciplines become specialised and subdivided, it is important to conduct the research/researcher evaluation based on a deeper qualitative assessment in a balanced manner.3 6 One of the limitations of qualitative evaluation is its time-consuming nature. It is advantageous to reach an agreement on the importance of this time-consuming process and to shorten the time required for qualitative evaluation. Another limitation is the transparency of the assessment’s basis. It is desirable to reconsider which aspects of research/researcher should be valued, namely, the research/researcher’s mission, within each community or organisation, as well as to clarify the evaluation objective.3 34

Strengths and limitations

This study conducted a nationwide survey with the assistance of the Japanese Association of Medical Sciences, which serves as the umbrella organisation for all medical academic societies in Japan. This design allowed us to reach our target population (ie, medical researchers in Japan) as broadly as possible. In total, we received 3139 valid responses from medical researchers in Japan, which improved the robustness of the analysis.

The study’s limitations include the use of a self-administered web-based survey. Furthermore, although the present sample of over 3000 responses produced robust results, the survey was completed by only a subset of medical researchers. Those who were already interested in research evaluation were more likely to complete the survey, which may have biased the results. The respondents’ gender imbalance was obvious, and it appeared to reflect the underlying gender gap that exists among Japanese doctors and medical researchers,29 35 which we and others regard as a problem in and of itself.36 37 Despite these sampling limitations, this study is the first to examine nationwide how medical researchers in Japan perceive the evaluation of research and researchers. This study’s results are expected to improve the methods used to evaluate researchers and, in turn, their research performance.

The primary analyses (shown in figures 2–4) focused on stratification by age group, gender and research field, three of the eight characteristic variables listed in table 1. Cross-tabulation results for all variables are described in online supplemental table S3. Future studies should explore the relationships between these variables in greater depth; for example, the effects of academic rank and age cannot be completely separated.

Conclusions

Although most medical researchers in Japan refer to IF and other quantitative indicators based on English paper publications for evaluating researchers, the ideal evaluation axes differ across research fields, generations and genders. We believe it is important to assess research evaluations and constantly review whether there is room for improvement while respecting different ideas from every research field, generation and gender.

Data availability statement

Data are available upon reasonable request. Data used in the analysis will be made available to researchers upon request, in compliance with ethical guidelines and with the ethics committee’s approval.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the Institutional Review Board of the National Center for Global Health and Medicine (NCGM-S-004530-01), and the study protocol was approved on 9 December 2022. Participants gave informed consent to participate in the study before taking part.

Acknowledgments

The authors appreciate the interview and survey respondents’ time and effort. For their unwavering support of the research, the authors thank the members of the committee of junior faculties (U40 Committee) of the Scientific Committee and executive members of the 31st General Assembly of the Japanese Association of Medical Sciences. The authors appreciate the questionnaire’s review and distribution by committee members of the Japanese Association of Medical Sciences and the Japanese Association of Medical Sciences Coalitions. The authors are grateful to the members of the Young Academy of Japan and the Science Council of Japan for several productive discussions. The authors also thank Dr Kenjiro Imai, Dr Noriko Ihana-Sugiyama and Ms Akiko Kimura-Wakui for supporting project administration. The authors extend their gratitude to Dr Takahiro Higashi, Dr Yoshiharu Fukuda and Dr Hideaki Shiroyama for their insightful advice. Finally, the authors appreciate the assistance provided by Dr Kenkichi Takase, Dr Amane Koizumi and Dr Kazuhiro Hayashi throughout the research.

References

Footnotes

  • Contributors Conceptualisation: AM, KK, MK, HF, TS. Methodology: AM, YS, KK, MK, HF, TS. Investigation: AM, YS, KK, MK, HF, TS. Visualisation: AM, YS, KK, MK, HF, TS. Funding acquisition: AM, KK, MK, HF, TS. Project administration: AM, TS. Supervision: TS. Writing – original draft: AM, YS, TS. Writing – review and editing: AM, YS, KK, MK, HF, TS. All authors have conducted the following: (1) substantial contributions to the conception or design of the work or the acquisition, analysis or interpretation of data for the work; (2) drafting the work or reviewing it critically for important intellectual content; (3) final approval of the version to be published; and (4) agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

    TS is the guarantor of this work and, as such, has full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

  • Funding This work was supported by the 53rd Kurata Grants (the Hitachi Global Foundation) in Humanities and Social Sciences 'Reconsideration of evaluation criteria for medical research and researchers aiming for better medical care from the standpoint of young medical researchers' (No. 1523).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.