Automatic identification of type 2 diabetes, hypertension, ischaemic heart disease, heart failure and their levels of severity from Italian General Practitioners' electronic medical records: a validation study
  1. Rosa Gini (1,2),
  2. Martijn J Schuemie (3,4),
  3. Giampiero Mazzaglia (5),
  4. Francesco Lapi (5),
  5. Paolo Francesconi (1),
  6. Alessandro Pasqua (5),
  7. Elisa Bianchini (5),
  8. Carmelo Montalbano (6),
  9. Giuseppe Roberto (1),
  10. Valentina Barletta (1),
  11. Iacopo Cricelli (6),
  12. Claudio Cricelli (7),
  13. Giulia Dal Co (8),
  14. Mariadonata Bellentani (8),
  15. Miriam Sturkenboom (2),
  16. Niek Klazinga (9)
  1. Agenzia regionale di sanità della Toscana, Osservatorio di epidemiologia, Florence, Italy
  2. Department of Medical Informatics, Erasmus Medical Center, Rotterdam, The Netherlands
  3. Department of Epidemiology, Janssen Research & Development, Titusville, New Jersey, USA
  4. Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
  5. Health Search, Italian College of General Practitioners and Primary Care, Florence, Italy
  6. Genomedics, Florence, Italy
  7. Italian College of General Practitioners and Primary Care, Florence, Italy
  8. Agenzia Nazionale per i Servizi Sanitari Regionali, Rome, Italy
  9. Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
  1. Correspondence to Dr Rosa Gini; rosa.gini{at}


Objectives The Italian project MATRICE aimed to assess how well cases of type 2 diabetes (T2DM), hypertension, ischaemic heart disease (IHD) and heart failure (HF), and their levels of severity, can be automatically extracted from the Health Search/CSD Longitudinal Patient Database (HSD). From the medical records of the general practitioners (GPs) who volunteered to participate, cases were extracted by algorithms based on diagnosis codes, keywords, drug prescriptions and results of diagnostic tests. A random sample of the identified cases was validated by interviewing the patients' GPs.
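
To make the idea of a rule-based extraction concrete, the following is a minimal sketch of how the four components named above (diagnosis codes, keywords, drug prescriptions and test results) might be combined for T2DM. It is not the MATRICE algorithm: the record layout (`PatientRecord`), the function names, the Italian keyword, the code prefixes and the HbA1c cut-off are all assumptions chosen for illustration, and the real HSD schema is not described in this abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PatientRecord:
    """Hypothetical, simplified patient record (not the HSD schema)."""
    diagnosis_codes: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    atc_codes: List[str] = field(default_factory=list)
    test_results: Dict[str, float] = field(default_factory=dict)


# Illustrative criteria only; assumptions, not the published algorithm.
DIAGNOSIS_PREFIXES = ("250",)  # ICD-9-CM category 250: diabetes mellitus
KEYWORDS = ("diabete",)        # assumed Italian free-text keyword
DRUG_PREFIXES = ("A10",)       # ATC group A10: drugs used in diabetes
HBA1C_CUTOFF = 6.5             # percent; a commonly used diagnostic cut-off


def matched_criteria(record: PatientRecord) -> Set[str]:
    """Return the names of the criteria that this record satisfies."""
    matched = set()
    if any(c.startswith(DIAGNOSIS_PREFIXES) for c in record.diagnosis_codes):
        matched.add("diagnosis_code")
    if any(k in note.lower() for note in record.notes for k in KEYWORDS):
        matched.add("keyword")
    if any(c.startswith(DRUG_PREFIXES) for c in record.atc_codes):
        matched.add("drug_prescription")
    hba1c = record.test_results.get("hba1c_percent")
    if hba1c is not None and hba1c >= HBA1C_CUTOFF:
        matched.add("test_result")
    return matched


def is_candidate_case(record: PatientRecord) -> bool:
    """Flag a record as a candidate case if any criterion matches."""
    return bool(matched_criteria(record))


# Usage with a made-up record: a diabetes code plus a metformin prescription.
example = PatientRecord(diagnosis_codes=["250.00"], atc_codes=["A10BA02"])
print(sorted(matched_criteria(example)))
```

Returning the set of matched criteria, rather than a bare yes/no, keeps track of why a patient was flagged, which is useful when a GP later confirms or rejects the case.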

Setting HSD is a database of primary care medical records. A panel of 12 GPs participated in this validation study.

Participants 300 patients were sampled for each disease, except for HF, where 243 patients were assessed.

Outcome measures The positive predictive value (PPV) was assessed for the presence/absence of each condition against the GP's response to the questionnaire, and Cohen's κ was calculated for agreement on the severity level.
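
As a rough illustration of the first outcome measure, the PPV is the share of algorithm-identified cases that the GP confirms, and it is usually reported with a confidence interval. The sketch below uses the Wilson score interval at the 95% level; the abstract does not say which interval method or level the authors used, so both are assumptions, as are the function names and the counts in the usage line.

```python
import math
from typing import Tuple


def ppv(confirmed: int, sampled: int) -> float:
    """Positive predictive value: confirmed cases / sampled identified cases."""
    if sampled <= 0 or not 0 <= confirmed <= sampled:
        raise ValueError("need 0 <= confirmed <= sampled and sampled > 0")
    return confirmed / sampled


def wilson_interval(confirmed: int, sampled: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = ppv(confirmed, sampled)
    z2 = z * z
    denom = 1 + z2 / sampled
    centre = (p + z2 / (2 * sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / sampled + z2 / (4 * sampled ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


# Usage with hypothetical counts (not taken from the study).
low, high = wilson_interval(confirmed=90, sampled=100)
print(f"PPV = {ppv(90, 100):.2f}, interval = ({low:.2f}, {high:.2f})")
```

Because only algorithm-positive patients are sampled, this design yields the PPV but not sensitivity or specificity, which would require a sample of patients the algorithm did not flag.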

Results The PPV was 100% (99% to 100%) for T2DM and hypertension, 98% (96% to 100%) for IHD and 55% (49% to 61%) for HF. Cohen's κ for agreement on the severity level was 0.70 for T2DM and 0.69 for hypertension and IHD.
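
For readers less familiar with the agreement statistic, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance from their marginal label frequencies: κ = (p_o − p_e) / (1 − p_e). The sketch below computes the unweighted form for an algorithm and a GP assigning severity levels; the abstract does not say whether a weighted variant was used, and the function name and severity labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Usage with made-up severity labels (not study data).
algorithm = ["mild", "mild", "severe", "severe"]
gp = ["mild", "mild", "severe", "mild"]
print(cohens_kappa(algorithm, gp))  # 0.5
```

In this toy example the raters agree on 3 of 4 patients (p_o = 0.75) while chance agreement is 0.5, giving κ = 0.5; values around 0.7, as reported here, indicate agreement well above chance but short of perfect.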

Conclusions This study shows that individuals with T2DM, hypertension or IHD can be validly identified in HSD by automated identification algorithms. Automatic queries for levels of severity of the same diseases compare well with the corresponding clinical definitions, but some misclassification occurs. For HF, further research is needed to refine the current algorithm.

  • Validation study
  • electronic medical records

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


  • Contributors RG, MJS, GM, PF, NK and MS conceived the study. CM designed and developed the data collection tool. GM, PF, FL and AP developed the algorithms. CM, AP, IC and CC conducted data collection. AP and EB conducted data analysis. VB and GR revised and translated the clinical terminology. RG wrote the manuscript and all the authors contributed with revisions and insight.

  • Funding This study was supported by the project named ‘Integrazione dei contenuti informativi per la gestione sul territorio di pazienti con patologie complesse o con patologie croniche’, MATRICE, funded by the Italian Ministry of Health in the framework of the MATTONI Program.

  • Competing interests MJS is an employee of a pharmaceutical company. Genomedics is a private company owned by IC. RG, FL, GR and MS have conducted pharmacoepidemiological studies funded by pharmaceutical companies.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available.