Article Text

Accuracy and readability of cardiovascular entries on Wikipedia: are they reliable learning resources for medical students?
  1. Samy A Azer,
  2. Nourah M AlSwaidan,
  3. Lama A Alshwairikh,
  4. Jumana M AlShammari
  1. Department of Medical Education, Curriculum Development and Research Unit, College of Medicine, King Saud University, Riyadh, Saudi Arabia
  1. Correspondence to Professor Samy A Azer; azer2000@optusnet.com.au

Abstract

Objective To evaluate the accuracy of content and the readability level of English Wikipedia articles on cardiovascular diseases, using quality and readability tools.

Methods Wikipedia was searched on 6 October 2013 for articles on cardiovascular diseases. Articles were independently scored by three assessors using a modified DISCERN instrument (DISCERN is widely used in assessing online resources). Readability was calculated using the Flesch-Kincaid Grade Level. The inter-rater agreement between evaluators was calculated using the Fleiss κ scale.

Results This study was based on 47 English Wikipedia entries on cardiovascular diseases. The DISCERN scores had a median of 33 (IQR=6). Four articles (8.5%) were of good quality (DISCERN score 40–50), 39 (83%) were moderate (DISCERN 30–39) and 4 (8.5%) were poor (DISCERN 10–29). Although the entries covered the aetiology and the clinical picture, there were deficiencies in the pathophysiology of diseases, signs and symptoms, diagnostic approaches and treatment. The number of references per entry varied from 1 to 127; 25.9±29.4 (mean±SD). Several problems were identified in the list of references and the citations made in the articles. The readability of articles was 14.3±1.7 (mean±SD), consistent with the readability level for college students. In comparison, Harrison’s Principles of Internal Medicine 18th edition had more tables, fewer references and no significant difference in the number of graphs, images or illustrations or in readability level. The overall agreement between the evaluators was good (Fleiss κ=0.718, 95% CI 0.57 to 0.83).

Conclusions The Wikipedia entries are not aimed at a medical audience and should not be used as a substitute for recommended medical resources. Course designers and students should be aware that Wikipedia entries on cardiovascular diseases lack accuracy, predominantly due to errors of omission. Further improvement of the cardiovascular content on Wikipedia would be needed before these entries could be considered a supplementary resource.

  • MEDICAL EDUCATION & TRAINING

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/


Strengths and limitations of this study

  • Standardised quality and readability tools were used to answer the research questions.

  • The study solely focused on 47 adult cardiovascular diseases and only those entries in the English language.

  • The results cannot be generalised to other medically related topics on Wikipedia.

  • The work was not blinded. Textbooks are not free from possible errors or differences in their contents.

  • Wikipedia is not intended to be used as a textbook for medical students.

Introduction

With the introduction of integrated medical curricula and self-directed learning, students need to research their learning issues using a wide range of learning resources.1,2 These changes aim at encouraging active learning and moving away from passive learning or reliance on the teacher as the main source of information. Student-centred learning approaches are particularly recommended in medical schools because of the rapid proliferation of medical information.3,4 Therefore, medical students need to acquire and update their knowledge continuously, and master the skills of searching for information rather than limit their learning to lectures or the content of a particular textbook.1

Owing to the increasing use of the social web, also known as Web 2.0, the way information is produced, shared and used has changed.5 Medical students increasingly rely on the Internet and websites such as Google and Wikipedia when searching for information.2 Recently, one study showed that one-third of college students use Wikipedia for academic learning.6 Guarino et al7 showed that Wikipedia significantly surpassed peer-reviewed medical databases as students' preferred learning resource. Other studies highlighted the increasing use of Wikipedia among dental students2 and surgical residents.8 This tendency may be related to ease of access. People most often start by searching Google, and Wikipedia is usually among the top search results.9 Wikipedia is also well known to most online users. Furthermore, Wikipedia is available at any time and from anywhere; users only need a computer or a smartphone connected to the Internet to search it.5 Other advantages underlying students' reliance on web-based resources such as Wikipedia may include: (1) information provided online usually addresses different aspects of an issue and uses a range of resources to explain it, such as images, videos, diagrams and entries,5 and (2) information provided online is updated more frequently than textbooks are.10 These factors may play a role in making students prefer online resources to textbooks.

Wikipedia, created in 2001, is a free multilingual online encyclopedia that provides entries about almost any topic and is based on an openly editable model.9,11,12 Up to 28 May 2015, 4 880 039 entries had been published in the English language alone, with a total of 36 373 672 pages and an increase of 197 138 articles during the preceding year.13,14 It is interesting to note that Wikipedia has created this number of entries without a dedicated budget or editorial team. The reason for the high number of Wikipedia entries is that anyone can write an entry (an article); the project is based entirely on volunteers from different countries, who work cooperatively to create, edit and update articles.

Interestingly, searching for information using a search engine such as Google or Yahoo typically shows Wikipedia among the first 10 results.9 As of May 2015, the number of views worldwide was over 453.10 million,15 and there were more than 33 000 medical and health-related articles.13,14

Although a good number of research publications have examined the reliability of Wikipedia (1) as an educational resource for patients,16 nursing students,17 clinical pharmacy education,18 drug information19 and veterinary medicine,20 and (2) for gathering information on gastrointestinal and liver diseases,21 respiratory diseases22 and the nervous system,23 there have been no studies examining the scientific accuracy and readability level of the Wikipedia entries on cardiovascular diseases, or whether these entries can be a reliable learning resource for medical students. Therefore, the aim of this study was to assess the scientific accuracy and readability of the Wikipedia entries on cardiovascular diseases and whether they can be a suitable learning resource.

Methods

Study design

Five standardised medical textbooks were used as a reference for identifying the cardiovascular topics to be searched on Wikipedia. These textbooks (box 1) were selected because they are widely recommended in undergraduate medical courses in most medical schools and are also recommended for examination preparation by regulatory authorities such as the Australian Medical Council in Australia (http://www.amc.org.au/publications#medicine), the General Medical Council in the UK (http://www.gmc-uk.org/doctors/plab/23448.asp) and the Royal Colleges such as the College of Physicians and Surgeons of Canada (https://www.cpsbc.ca/content/exam-preparation). They are written by experts and clinical teachers in different specialties of medicine, are regularly updated and reviewed, and have been reviewed by scholars in prestigious medical journals such as the Journal of the American Medical Association, the British Medical Journal and The New England Journal of Medicine.24–27 The aims of using these textbooks were to: (1) identify the cardiovascular topics to be evaluated, and (2) use them as a reference resource during the assessment of the content of the Wikipedia entries. The strategy for searching the five cardiovascular chapters included: (1) identifying key topics and the content under each topic, and assessing the emphasis made, and (2) identifying topics that shared maximum emphasis across the five chapters. Each evaluator searched these chapters independently and the outcomes were discussed in a meeting. Topics shared among the researchers were further discussed, and topics covering rare diseases or rare syndromes were not included; in fact, rare diseases included in Wikipedia were evaluated in an elegant study.9 The final list comprised 47 Wikipedia entries covering issues commonly included in undergraduate medical programmes. Another source for identifying the 47 topics was the author's long experience with medical school curricula at the University of Sydney and the University of Melbourne in Australia, as well as other detailed curricula for undergraduate medical schools such as curriculum verification reports (University of California, San Francisco School of Medicine; http://www.aamc.org/download/363632/data/sampleverifreprtucsf.pdf). Other sources were the recommendations of the report of Good Medical Practice (2013) issued by the General Medical Council (http://www.gmc-uk.org/guidance/good_medical_practice.asp) and the guidelines for medical schools issued by the Australian Medical Council (http://www.amc.org.au/accreditation/primary-medical-education).

Box 1

Medical textbooks used as a reference in evaluating Wikipedia articles

Andreoli TE, Benjamin IJ, Griggs RC, et al. Andreoli and Carpenter's Cecil Essentials of Medicine, 8th Edition. Philadelphia: Saunders, 2010.

Colledge NR, Walker BR, Ralston SH. Davidson's Principles & Practice of Medicine, 21st Edition. Edinburgh: Elsevier, Churchill Livingstone, 2010.

Kumar P, Clark M. Kumar and Clark's Clinical Medicine, 8th Edition. Edinburgh: Elsevier, 2013.

Longo DL, Fauci AS, Kasper DL, et al. Harrison's Principles of Internal Medicine, 18th Edition. New York: McGraw-Hill, 2012.

McPhee SJ, Papadakis MA, Rabow MW. 2011 Current Medical Diagnosis & Treatment, 50th Anniversary Edition. New York: McGraw-Hill, Lange, 2011.

Searching Wikipedia

The topics identified were used as keywords for searching Wikipedia (http://www.wikipedia.com). The search was conducted on 6 October 2013 by three evaluators independently, and the entries identified were printed out on the same day. All 47 predetermined topics on common cardiovascular disorders were found to have a corresponding article on Wikipedia. Each researcher received a copy of the 47 Wikipedia entries for further evaluation. The reason for printing out copies was to ensure that all researchers evaluated the same version of a particular entry. This is important, since Wikipedia entries are constantly being updated and changed.14

Assessing the accuracy and depth of coverage

To assess the accuracy of the content of Wikipedia entries, a modified version of the DISCERN instrument was used. The original DISCERN is a standardised set of criteria for judging the quality of health information written for the public on treatment options.28 The DISCERN instrument was created by the University of Oxford, and the project was funded by the British Library and the National Health Service (NHS) Research & Development Programme.28 It consists of 15 questions plus an overall quality-rating question (http://www.discern.org.uk/discern.pdf). The instrument has been used to assess healthcare-related websites and online resources, for example, the quality of patient information on surgical treatment of haemorrhoids,29 colorectal cancer information30 and urology-related patient information.31

However, the original version of the DISCERN instrument was not suitable for evaluating online resources such as Wikipedia entries; it did not examine whether the information provided is scientifically correct and in agreement with current valid resources, or whether there were gaps or scientific errors in the given information. The original DISCERN instrument also did not evaluate the images, figures and tables that support the information provided in an article and enhance understanding of the topic discussed. These deficiencies prompted the use of the modified DISCERN version, which has been tested and used in earlier publications.21–23 The modified DISCERN comprises 10 questions. The questions covered the aims and objectives of the topic, the scientific accuracy of the content and whether the content was neutral (not based on the personal views of authors); they also assessed whether the contents allocated to each part or each subtitle were distributed in a balanced way, and examined the clarity of the content, the frequency of updating and the quality of images, illustrations and tables, and whether these added value and enhanced understanding of the topic. The final question was an overall rating of the article (entry) (see online supplementary appendix S1). Each question was rated on a 5-point scale: 1 corresponds to 'No, absent or not addressed', 3 corresponds to 'partially addressed, that is, addressed but not at an adequate or satisfactory level', and 5 corresponds to 'Yes, adequately addressed'. For question 10, a score of 1 was given if the answer was 'extensive shortcomings, for example, the article was not complete and/or lacked key issues, and/or had several scientific errors or contradictory statements', 3 was given if the answer was 'potentially important but not extensive shortcomings, for example, the article was complete but had one major deficiency that may limit its educational value' and 5 was given if the answer was 'minimal shortcomings, for example, the article had minor shortcomings that would not interfere with its overall educational value'. The maximum score for the modified DISCERN is 50 and the minimum score is 10. To make the total scores meaningful, cut-off points were used: a total score of 40 or above is described as Good, a score of 30 to 39 as Moderate and a score of less than 30 as Poor.
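To illustrate the arithmetic of the scoring scheme described above, the following is a minimal Python sketch (not the authors' actual tooling; the item scores shown are hypothetical) that sums the 10 item scores and applies the Good/Moderate/Poor cut-offs:

def discern_total(item_scores):
    # Sum the ten modified DISCERN item scores; each item is rated 1-5.
    assert len(item_scores) == 10 and all(1 <= s <= 5 for s in item_scores)
    return sum(item_scores)

def discern_category(total):
    # Map a total score (10-50) to the cut-off categories used in the study.
    if total >= 40:
        return "Good"
    if total >= 30:
        return "Moderate"
    return "Poor"

# Hypothetical example: one evaluator's ratings for a single entry.
scores = [3, 3, 5, 3, 3, 3, 3, 5, 3, 3]
total = discern_total(scores)
print(total, discern_category(total))  # 34 Moderate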

Piloting the work

Before using the modified DISCERN to assess the 47 Wikipedia entries, the work was piloted. The aims of piloting were to: (1) familiarise the evaluators with the scoring system of the DISCERN instrument, (2) discover sources of disagreement among evaluators and discuss ways to minimise disagreement and (3) enable the evaluators to master the use of the instrument. For the pilot phase, it was decided to choose 10 entries comparable to, but other than, the 47 entries included in the study. Each evaluator was asked to apply the criteria independently. Once the evaluation was completed, the outcomes were discussed in a meeting. Wikipedia entries that were scored differently by different evaluators were discussed until an agreement was reached. Another 10 entries were then evaluated independently. After two rounds of piloting, the agreement among evaluators was in the range of 80–90%, indicating readiness to assess the articles included in the study.

Conducting the study

Following the same methodological approach, the 47 Wikipedia entries were evaluated independently using the modified DISCERN instrument. Scores were then entered into an Excel spreadsheet (Microsoft Excel for Mac 2011, V.14.4.1, Microsoft Corp, Redmond, Washington, USA). The agreement between the evaluators was measured using the Fleiss κ scale.32–34
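As an illustration of how the Fleiss κ could be computed outside a commercial package, the sketch below uses the statsmodels library; the ratings shown are invented and the workflow is an assumption, not the authors' code:

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical data: rows are Wikipedia entries, columns are the three
# evaluators, values are quality categories (0=Poor, 1=Moderate, 2=Good).
ratings = np.array([
    [1, 1, 1],
    [2, 2, 1],
    [1, 1, 1],
    [0, 0, 1],
    [2, 2, 2],
])

# aggregate_raters converts subject-by-rater ratings into a subjects x
# categories count table, which fleiss_kappa expects as input.
table, _ = aggregate_raters(ratings)
print(f"Fleiss kappa = {fleiss_kappa(table):.3f}")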

Assessing the references

Since one of the parameters of assessing academic and scholarly work rests on the references used in creating the work, it was decided to assess the references of each Wikipedia entry. The assessment included the following: (1) total number of references, (2) number of references from peer-reviewed journals, (3) number of references from peer-reviewed journals published in the past 5 years, (4) educational/procedural guidelines, (5) textbooks, (6) professional websites, (7) general websites and (8) news and media. The evaluators assessed the references independently.

Frequency of updating Wikipedia entries

Owing to the continuing changes and discoveries in the medical and scientific fields, it is important to assess the rate of updating of the cardiovascular articles. The frequency of updating each article was assessed through a review of its 'history page' on Wikipedia. The page includes a detailed history such as the date of creation of the article, the number of authors, the frequency of updates and the changes made to the article since its creation. The aim was to assess the number of updates/revisions made up to the date of printing out the copies (6 October 2013) and to calculate the number of updates in the 12 months prior to printing. The outcomes of such revisions and edits were also examined, in a few example Wikipedia articles, with regard to their adequacy of knowledge and scientific accuracy.
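The authors reviewed each entry's history page manually; as a purely illustrative alternative, revision counts up to the study date could also be retrieved programmatically through the public MediaWiki API, as in the sketch below (the article title is only an example and error handling is omitted):

import requests

API = "https://en.wikipedia.org/w/api.php"

def count_revisions(title, cutoff="2013-10-06T23:59:59Z"):
    # Count revisions made on or before the cutoff timestamp (ISO 8601).
    params = {
        "action": "query", "prop": "revisions", "titles": title,
        "rvprop": "timestamp", "rvlimit": "max", "rvstart": cutoff,
        "format": "json",
    }
    count = 0
    while True:
        data = requests.get(API, params=params, timeout=30).json()
        page = next(iter(data["query"]["pages"].values()))
        count += len(page.get("revisions", []))
        if "continue" not in data:
            return count
        params.update(data["continue"])  # follow API pagination

print(count_revisions("Myocardial infarction"))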

Assessing readability

To test the hypothesis of whether the articles were written at a college student's reading level or at a level suitable for the general public, it was decided to calculate the readability level of each article. Using the free online readability instrument provided by 'Readability Formulas' (http://www.readabilityformulas.com/), the Flesch-Kincaid Grade Level tool was used to calculate readability;35 this measure has been widely used in a number of studies to calculate reading levels.16,21–23,30,35–37

The formula for the Flesch-Kincaid Grade Level score is: Grade Level = (0.39 × ASL) + (11.8 × ASW) − 15.59

Here, ASL is the average sentence length (the number of words divided by the number of sentences) and ASW is the average number of syllables per word (the number of syllables divided by the number of words). This tool evaluates text based on US school grade levels. For instance, a score of 10 indicates the reading level of a 10th grade student, and scores in the range of 14 to 16 indicate the reading level of college students. To calculate the readability score, each evaluator randomly selected and copied a sample of text of between 150 and 600 words from the beginning, middle and end of each article. The text was placed into the online calculator and the readability score obtained for each input. Headings, external links and images were excluded. The mean readability score and SD were calculated for each article.
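The study itself used the online calculator at readabilityformulas.com; the short Python sketch below merely reproduces the Flesch-Kincaid arithmetic described above, with a rough vowel-group heuristic for syllable counting that will not match the online tool exactly:

import re

def count_syllables(word):
    # Crude estimate: count groups of consecutive vowels (at least one).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    asl = len(words) / sentences                                # words per sentence
    asw = sum(count_syllables(w) for w in words) / len(words)   # syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59

sample = ("Atrial fibrillation is a common cardiac arrhythmia. "
          "It increases the risk of stroke and heart failure.")
print(round(fk_grade(sample), 1))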

Assessing medical textbooks

The aim was not to compare textbooks with Wikipedia articles, because Wikipedia is not written for medical students and is not intended to be an academic resource. However, it was decided to assess one textbook, Harrison's Principles of Internal Medicine, 18th edition, with regard to the number of images, illustrations, graphs, tables and references, as well as its readability, using the Flesch-Kincaid Grade Level. This assessment was limited to the content corresponding to the 47 cardiovascular articles. The DISCERN instrument was not applied to the textbooks because they were used in this study as a reference resource for comparison. Furthermore, the only reference against which the scientific accuracy of textbooks could be tested is the medical literature, a task outside the scope of this study. While most textbooks in their paper format are regularly reviewed, with new editions released every 4–5 years, most have introduced online resources to accompany them. For example, the Access Medicine CD (accessmedicine.com) is provided with the Harrison textbook, along with online resources, images, tables and figures. The other four books are linked with Expert Consult Online, which provides users with updates and additional resources. These online resources are regularly updated, with indications of when they were updated.

Statistical analysis

The final scores for accuracy and readability of each article were collected from each evaluator and entered into an Excel spreadsheet. Analyses of the median, IQR, mean and SD, as well as correlation studies, were performed using SPSS (V.22.0 for Mac OS, SPSS Inc, Chicago, Illinois, USA). The agreement between evaluators was calculated using the Fleiss κ scale.
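For readers who prefer open tools over SPSS, the same summary statistics and correlations can be reproduced along the following lines; the spreadsheet filename and column names are illustrative assumptions, not the authors' actual data file:

import pandas as pd
from scipy import stats

df = pd.read_excel("wikipedia_cv_scores.xlsx")  # hypothetical file

discern = df["discern_total"]
print("median:", discern.median(),
      "IQR:", discern.quantile(0.75) - discern.quantile(0.25),
      "mean:", round(discern.mean(), 1), "SD:", round(discern.std(), 1))

# Pearson correlation between the DISCERN score and the number of updates;
# R-squared is the square of the correlation coefficient r.
r, p = stats.pearsonr(df["discern_total"], df["total_updates"])
print(f"R^2 = {r**2:.3f}, p = {p:.4f}")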

Results

Overall evaluation of Wikipedia entries

In the template created by Wikipedia, each article has a table of contents to guide the reader. In addition, an ‘infobox’ displays the International Classification of Diseases (ICD-10, ICD-9), Medical Subject Headings (MeSH) codes, Online Mendelian Inheritance in Man (OMIM) entry and links to the topic on MedlinePlus, eMedicine and Diseases Database. Also, many of the articles have external links for further reading, and images to support and elaborate on the information provided. It is worth mentioning that the pericardial effusion article has a two-dimensional transthoracic echocardiogram animation of pericardial effusion, and the article on palpitations has two audio recordings demonstrating the difference between normal and abnormal heart rhythm. Such audiovisual aids aim at enhancing the understanding of the written description.

The articles covered key points such as disease aetiology, clinical picture and treatment. However, in most articles, there were deficiencies regarding the pathophysiology, mechanisms, diagnostic approach and management plan. Also, significant variations in covering the appropriate subtitles and key concepts were noted in some articles. For example, while the article on atrial fibrillation mentioned the definition of the disease, its classification, causes, clinical signs and symptoms and the diagnosis of a patient with atrial fibrillation as well as its pathophysiology, management, prognosis and epidemiology, the article on pulmonic regurgitation failed to address diagnosis, pathology, pathophysiology, epidemiology and management.

Although articles (entries) used images, including illustrations and tables to further explain the content, not all articles had images, illustrations or tables. For example, the articles on acute pericarditis, angina pectoris, chest pain, pulmonic regurgitation, pulmonic stenosis, tricuspid regurgitation and tricuspid stenosis had no images or tables. The total number of images and tables in the 47 Wikipedia entries was only 179; 3.8±3.1 (mean±SD).

Scientific accuracy and depth of coverage

Table 1 shows the median and the IQR of the accuracy scores for each article. The highest score was 45, for the article on deep vein thrombosis, while the lowest was 28, for the article on acute pericarditis. The 47 Wikipedia articles had a median score of 33 (IQR=6); the highest possible score was 50. Out of the 47 entries, 4 (8.5%) articles scored 40 or higher (Good as per our cut-off system), 39 (83%) scored 30 to 39 (Moderate) and 4 (8.5%) scored less than 30 (Poor). We did not observe vandalism of the 47 entries during the conduct of the study (raw data covering the DISCERN scores given by the three evaluators for each Wikipedia entry and the calculation of the readability of the Wikipedia entries are shown in online supplementary material 1).

Table 1

Summarises the accuracy, the DISCERN scores (median and IQR), number of images, illustrations, tables and readability score of Wikipedia cardiovascular articles

Assessing the references

Table 2 summarises the mean and SD of the references included in the 47 Wikipedia cardiovascular entries. The total number of references was 1218, and the number of references in each article varied from 1 to 127. Of the 1218 references, 229 were from the 1990s and 6 were from the 1980s. Thirty-five references did not mention the year of publication. The majority of the 1218 references were from peer-reviewed medical journals, making a total of 790 references (65%). Other references were professional medical websites (10%), educational/procedural guidelines (6%), textbooks (10%), general websites (5%), news and media (2%), and others (2%). Several errors were identified in citing the references: (1) a total of 182 (14.9%) were incomplete references, missing information such as the author's name, title of the article, year, date of retrieval of online websites or the book publisher, (2) 21 entries included a statement such as 'citation needed' after some sentences or paragraphs, indicating that the origin of the statements made needs to be verified, (3) inconsistency among entries, or within the same entry, in citing references was observed (n=29, 61.7%); some references were cited using the American Psychological Association (APA) style, others using the Harvard style, and (4) URL links to several references were found to be broken or inaccessible (n=46, 3.8%). It was also noticed that the quality of references varied substantially, from peer-reviewed journal articles to blogs reviewing episodes of comedy TV shows. On average, only 73.7% of the references per article (entry) were accurate in all respects.

Table 2

Summarises the mean and SD of the references of the 47 English Wikipedia entries on cardiovascular diseases included in the study

Frequency of updating the topics

Among the 47 Wikipedia entries, the entry on bradyarrhythmias was the earliest article, created in 2001. The total number of updates for that article was 350, with 40 updates made in the past 12 months. On the other hand, the most recent article, created in 2008, was on pulmonic regurgitation, with 22 total updates and 4 updates in the past 12 months. The total number of updates of the 47 Wikipedia entries was in the range of 22 to 3700; 624.9±747.4 (mean±SD). The number of updates in the past 12 months ranged from 2, for the article on tricuspid stenosis, to 215, for the article on myocardial infarction; 46.2±43.3 (mean±SD). Regardless of the year of creation, the frequency of updates varied significantly between articles. For example, the articles on arterial hypertension and endocarditis were both created in 2002. During the past 11 years, arterial hypertension had 3700 edits, 100 of them in the past 12 months, making a yearly average of 336 edits, while endocarditis had 369 total edits over 11 years, 8 of them in the past 12 months, a yearly average of 33 edits. Figure 1 shows a cumulative frequency graph (ogive) for updates of the 47 Wikipedia entries on cardiovascular diseases since the date of their publication.

Figure 1

Cumulative frequency graph (Ogive) for updates of the 47 Wikipedia entries on cardiovascular diseases.

Table 3 summarises a few examples of knowledge deficiencies and scientific inaccuracies in four of the Wikipedia entries on cardiovascular diseases included in the study. The examples show that, despite the relatively good number of updates to each of these articles (made since the time of their publication and over the past 12 months), the articles were not free from knowledge deficiencies and inaccuracies. Suggestions for improvement are provided for each item identified.

Table 3

Examples of knowledge deficiencies and scientific inaccuracies in the English cardiovascular Wikipedia entries and suggestions for improvements

Correlation of the DISCERN score and other parameters

A correlation was found between the DISCERN score and the total number of peer-reviewed journals included in the references (R2=0.234, p<0.001) and between the DISCERN score and the number of peer-reviewed references in the past 5 years (R2=0.243, p<0.001). There was also a correlation between the DISCERN score and the total number of article updates (R2=0.333, p<0.001; figure 2) and a strong correlation between the number of pages and the total number of updates (R2=0.647, p<0.001; figure 3).

Figure 2

Correlation between the DISCERN score and total number of article updates for Wikipedia cardiovascular articles.

Figure 3

Correlation between number of article pages and the number of article updates for Wikipedia cardiovascular articles.

Calculating readability

Table 1 shows the readability scores of each Wikipedia entry measured by the Flesch-Kincaid Grade Level. The minimum score was 10.6±1.1 for the entry on bradyarrhythmias, while the maximum score was 20.6±8.3 for the entry on chest pain. The mean readability score was 14.3±1.7 (mean±SD). The results are consistent with a college reading level.

Comparing Wikipedia entries with respective articles in the Longo et al (2012) textbook

Table 4 summarises a comparison between the Wikipedia entries and the respective cardiovascular topics in the textbook by Longo et al (2012), Harrison's Principles of Internal Medicine, 18th Edition. The table shows that the textbook provided more images, graphs and tables than the Wikipedia articles did. The differences were not significant for the graphs (p=0.055) or images (p=0.998), but were significant for the tables (p=0.004). Although the Wikipedia articles had more illustrations, the difference was not significant (p=0.148). However, the number of references was significantly lower in the textbook compared with the Wikipedia articles (p=0.002). No significant difference was found in readability level (p=0.524). It is interesting to note that the new edition (the 19th Edition) of Harrison's Principles of Internal Medicine, which appeared on the market after the submission of this paper, provides access to multimedia resources including practical videos demonstrating essential bedside procedures, physical examination techniques, and endoscopic and cardiovascular findings. (Raw data covering the Longo et al (2012) numbers of images, illustrations, graphs, tables and references, and readability, are shown in online supplementary material 2.)

Table 4

Comparing the 47 Wikipedia entries on cardiovascular diseases and corresponding topics in Longo et al (2012)

Agreement among evaluators

The inter-rater agreement for the modified DISCERN instrument was calculated using the Fleiss κ scale; the overall agreement was good (Fleiss κ=0.718, 95% CI 0.57 to 0.83).

Discussion

The aim of this study was to assess the accuracy and readability of Wikipedia entries on cardiovascular diseases and whether they are suitable as a learning resource for medical students. Although the Wikipedia entries covered the aetiology, clinical picture of the disease and its treatment, there were some deficiencies in the pathophysiology of diseases, signs and symptoms, diagnostic approach and treatment. Some entries were incomplete and the treatment options available were not well covered. To assess the accuracy of the content of the Wikipedia entries, the evaluators used a modified DISCERN instrument, because the original instrument: (1) has been criticised for not analysing the quality of information in significant detail,38 (2) does not analyse tables, images or illustrations as part of its evaluation parameters and (3) was originally designed to analyse websites on patient education rather than academic work. The modified DISCERN instrument was used in earlier publications21–23 and was piloted before its use in assessing the 47 entries. The piloting familiarised the evaluators with applying the instrument and enabled them to master its use with minimal disagreement. The agreement between the evaluators was acceptable (Fleiss κ=0.718, 95% CI 0.57 to 0.83).32–34

The articles covered key points such as disease aetiology, clinical picture and treatment. However, there were deficiencies in most articles, regarding the pathophysiology of diseases, diagnostic approach and treatment. Also, significant variations in covering the appropriate subtitles and key concepts were noted in some articles.

In addition, key areas in some articles were not properly addressed and needed further detail. For example, the diagnosis section in the article on pericardial effusion was based only on images of investigations such as radiology and ECG, with a brief explanation of the findings, without addressing the approach needed to diagnose a patient with pericardial effusion. In contrast, some articles contained information that was too detailed, such as the types and subtypes of heart valves mentioned in the article on artificial heart valves. In addition, there was variation in the length of the paragraphs within each article and in the length of the articles.

The results from this study showed that the total number of references was 1218 and, of these, 790 were from peer-reviewed journals. The number of references per entry varied from 1 to 127; 25.9±29.3 (mean±SD). The references comprised a range of sources including peer-reviewed journals, educational/procedural guidelines, textbooks, professional websites, general websites, and news and media. Although references from peer-reviewed journals comprised 65% of the total, peer-reviewed journal articles from the past 5 years comprised only 18% of the total references. These peer-reviewed journals included high-impact journals such as Hypertension, Circulation, the American Journal of Cardiology, the Lancet, the Journal of the American Medical Association and the New England Journal of Medicine. The correlations between the DISCERN scores and the total number of peer-reviewed journal references (p<0.001), and between the DISCERN score and the number of peer-reviewed journal references from the past 5 years (p<0.001), indicate the importance of peer-reviewed references in enhancing the accuracy of the content of Wikipedia entries.

However, the Wikipedia entries were deficient in citing educational guidelines produced by cardiology societies and associations such as the American Heart Association, the European Society of Cardiology and the British Hypertension Society. Only 6% of the references were guidelines. In addition, several problems were observed in the way the references were presented and cited. These findings are in agreement with Haigh,17 who concluded that Wikipedia citations should be treated with some caution.

The study also showed that Wikipedia entries are regularly updated. Yet the frequency of updating varied significantly between entries (figure 1). This variability may be due to a number of factors: (1) the lack of an editorial team to distribute work and ensure regular revision of each Wikipedia entry, (2) the review process being dependent on the interest of those Wikipedians and viewers willing to contribute and (3) popular, frequently consulted topics being more likely to attract the attention of Wikipedians. Collaborators from different backgrounds and different areas of interest usually carry out the updating of Wikipedia entries. Under the system created by Wikipedia, anyone can update or start a new Wikipedia entry without even registering. There are no assigned editors or topic coeditors appointed to review articles and ensure that they are scientifically correct, up to date, well organised and serve an educational purpose. This has resulted in variability in the quality of articles, their length and the accuracy of their content.

It is obvious from this study that some entries need images, illustrations or tables to enrich their educational value. Our findings showed that the mean number of images, illustrations and tables was 3.8±3.1, indicating the need for Wikipedians to consider this area when updating articles. The inclusion of audio recordings, video clips and multimedia in articles is also recommended. The comparison between the Wikipedia entries and the textbook (table 4) shows that Harrison's Principles of Internal Medicine, 18th Edition, had more tables, fewer references and no significant difference in the number of graphs, images or illustrations or in readability level.

There has been extensive debate about whether health information on the Internet, such as that on Wikipedia, should be controlled, rated or validated in some way by experts/professionals. Suggestions about possible means of achieving such a goal and improving the quality of Wikipedia articles have recently been discussed.12 These suggestions included collaborating with health organisations and involving experts to improve the content of medical articles on Wikipedia. Medical schools could challenge their students to analyse Wikipedia entries critically and contribute to editing articles.15 Also, medical and health-related societies/organisations could encourage their members or task groups to be involved in such projects and even provide a statement of approval on articles.39

Wikipedia has implemented several safety measures and quality surveillance mechanisms to identify and repair inaccurate information and to improve the quality of its articles. For example, it has initiated a protection policy that restricts contributors from editing or updating certain articles.40 However, until now, most articles remain unprotected.41 Wikipedia has also launched WikiProjects, initiatives aimed at improving its articles. The English Wikipedia comprises about 2000 WikiProjects, one of which is WikiProject Medicine.42 The latter project aims to improve existing medical/health-related articles, enhance the quality of references and add pictures to the articles.15 Wikipedia also has a self-rating system, ranging from stubs and start-class articles to good and featured articles, indicating that Wikipedia aims at conveying not only content but also places emphasis on the quality of entries. In addition, Wikipedia has an electronic citation template, but its use is not obligatory. A suggestion for improving the writing of references is to control the referencing style through an electronic template that all authors have to use, thus limiting inconsistencies when citing information. Other suggestions that could help improve the quality of the cardiovascular articles include adding audio clips of murmurs and video clips on cardiovascular diseases, labelling all diagrams and pictures, adding motion echocardiograms and linking embryology illustrations to congenital heart diseases. These improvements would enhance the quality of the articles and their educational value.

This study is not free from limitations. First, the Wikipedia articles on cardiovascular diseases may have improved since this research was conducted. Second, this research focused on adult cardiovascular diseases in the English language only and did not study articles on paediatric cardiovascular diseases or articles in other languages, so the results cannot be generalised to other medically related topics on Wikipedia. Third, despite piloting the use of the modified DISCERN instrument and the good inter-rater agreement between assessors, the modified DISCERN instrument may not enable the identification of every error or deficiency in a Wikipedia entry. Also, textbooks were used as a reference resource, and there may be some differences in their content. Fourth, the study focused on only 47 entries on common cardiovascular diseases and did not examine other entries related to the cardiovascular system, such as those on pathology, anatomy, physiology, toxicology, microbiology, pharmacology and public health.

Conclusions

Despite the enormous number of volunteers (Wikipedians) who have contributed to the Wikipedia cardiovascular entries, a number of deficiencies in the content were noted, particularly in the pathophysiology of diseases, signs and symptoms, diagnostic approaches and treatment. The use of images, illustrations and tables was not optimal, and most entries did not include audio recordings, video clips or multimedia to enrich their educational purpose. Although peer-reviewed articles were cited in the references, several problems were encountered with the references, which did not include educational guidelines or position statements produced by major international cardiovascular societies and associations. The entries lacked accuracy, predominantly due to errors of omission. However, Wikipedia is not intended to be a textbook for medical students; it is deliberately not aimed at a medical audience and is written for general readers. Therefore, these deficiencies should be considered when readers want to use Wikipedia entries in place of proper medical resources, which is not the purpose of Wikipedia. Nevertheless, improving the quality of the Wikipedia cardiovascular articles by scholars carries the potential to improve the educational usefulness of this online resource.


Supplementary materials

  • Supplementary Data


Footnotes

  • Contributors SAA was in charge of the research, including choosing the topic, formulating the research questions, reviewing the literature, designing the study, collecting data, evaluating the Wikipedia articles, analysing the data, drafting and critically reviewing the manuscript for important intellectual content, and leading the final revision of the manuscript. NMA-S, LAA-S and JMA-S contributed equally to each of the following: conception and design, data collection, evaluation of the Wikipedia articles, applying the DISCERN instrument, calculating readability, construction of tables and critical revision of the manuscript for important intellectual content; all provided final approval of the manuscript.

  • Funding This work was supported by the College of Medicine Research Center, Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Additional data can be accessed via the Dryad data repository at http://datadryad.org/ with the doi:10.5061/dryad.q2kg1.