Objective Since most biomedical research focuses on a specific disease, evaluation of research output requires disease-specific bibliometric indicators. Currently used methods are insufficient. The aim of this study is to develop a method that enables detailed analysis of worldwide biomedical research output by disease.
Design We applied text mining techniques and analysis of author keywords to link publications to disease groups. Fractional counting was used to quantify disease-specific biomedical research output of an institution or country. We calculated global market shares of research output as a relative measure of publication volume. We defined ‘top publications’ as the top 10% most cited publications per disease group worldwide. We used the percentage of publications from an institution or country that were top publications as an indicator of research quality.
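The three indicators described in the design (fractional counts, global market share, and percentage of top publications) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each publication belongs to one disease group, that credit is split equally among the distinct institutions on a publication, and that ties in citation counts are broken by input order. The record layout, the function names and the toy data are all hypothetical.

```python
from collections import defaultdict

def fractional_output(pubs):
    """Return {(disease, institution): fractional publication count}.

    Assumption: a publication with n distinct institutions gives 1/n to each.
    """
    totals = defaultdict(float)
    for _pid, disease, _cites, institutions in pubs:
        unique = set(institutions)
        for inst in unique:
            totals[(disease, inst)] += 1.0 / len(unique)
    return dict(totals)

def market_share(totals, disease):
    """Share of worldwide fractional output in one disease group, per institution."""
    world = sum(v for (d, _), v in totals.items() if d == disease)
    return {inst: v / world for (d, inst), v in totals.items() if d == disease}

def top_publication_ids(pubs, disease, fraction=0.10):
    """Ids of the most cited `fraction` of publications in a disease group."""
    group = sorted((p for p in pubs if p[1] == disease),
                   key=lambda p: p[2], reverse=True)
    n_top = max(1, int(len(group) * fraction))  # keep at least one publication
    return {p[0] for p in group[:n_top]}

def top_share(pubs, disease, fraction=0.10):
    """Percentage of each institution's fractional output that is 'top'."""
    top_ids = top_publication_ids(pubs, disease, fraction)
    group = [p for p in pubs if p[1] == disease]
    all_out = fractional_output(group)
    top_out = fractional_output([p for p in group if p[0] in top_ids])
    return {inst: 100.0 * top_out.get((d, inst), 0.0) / v
            for (d, inst), v in all_out.items()}

# Hypothetical toy data: (id, disease group, citations, institutions).
publications = [("p1", "cancers", 100, ["A", "B"])]
for i, inst in enumerate("ABC" * 3, start=2):  # p2..p10, one institution each
    publications.append((f"p{i}", "cancers", 11 - i, [inst]))

totals = fractional_output(publications)
shares = market_share(totals, "cancers")
quality = top_share(publications, "cancers")
```

In this toy set, institutions A and B each receive half a publication of credit for the co-authored paper `p1`, which is also the only publication in the top 10%. Counting at country level would work the same way with country names in place of institution names.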
Results We assigned 54% of the 6.5 million biomedical publications in our database (based on Web of Science) to a disease group, and we could assign 78% of these publications to a specific institution. Between 2000 and 2012, ‘other infectious diseases’ was the largest disease group, with 337 485 publications. Research output grew most for lifestyle diseases, cancers and mental disorders. The USA was responsible for the largest number of top 10% most cited publications per disease group, with a global share of 45%. Research volume grew most in Iran (+3500%) and China (+700%).
Conclusions The proposed method provides a tool to assess biomedical research output in new ways. It can be used for evaluation of historical research performance, to support decision-making in management of research portfolios, and to allocate research funding. Furthermore, using this method to link disease-specific research output to burden of disease can contribute to a better understanding of the societal impact of biomedical research.
- health economics
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Contributors LvdL and NH defined the disease groups, categorised the author keywords and compiled the disease-specific keywords. TdK, NH and LvdL performed the analysis. LvdL wrote the manuscript together with NH and TdK. NH and LvdL validated the results with researchers and deans. LW implemented the text mining algorithm, assigned the publications to disease groups and calculated the bibliometric statistics. The Centre for Science and Technology Studies (CWTS) at Leiden University provided the cleaned address data for the universities, hospitals and public research organisations included in the study. IM, LW and AG provided feedback on the manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Technical appendix can be provided. The appendix includes a definition of biomedical research by WoS research fields.