Abstract

Introduction. Routinely collected primary care data have underpinned research that has helped define primary care as a specialty. In the early years of the discipline, data were collected manually, but digital data collection now makes large volumes of data readily available. Primary care informatics is emerging as an academic discipline for the scientific study of how to harness these data. This paper reviews how data are stored in primary care computer systems; current use of large primary care research databases; and the opportunities and challenges for using routinely collected primary care data in research.

Opportunities. (1) Growing volumes of routinely recorded data. (2) Improving data quality. (3) Technological progress enabling large datasets to be processed. (4) The potential to link clinical data in family practice with other data including genetic databases. (5) An established body of know-how within the international health informatics community.

Challenges. (1) Research methods for working with large primary care datasets are limited. (2) How to infer meaning from data. (3) Pace of change in medicine and technology. (4) Integrating systems where there is often no reliable unique identifier and between health (person-based records) and social care (care-based records—e.g. child protection). (5) Achieving appropriate levels of information security, confidentiality, and privacy.

Conclusion. Routinely collected primary care computer data, aggregated into large databases, are used for audit, quality improvement, health service planning, epidemiological study and research. However, gaps exist in the literature about how to find relevant data, select appropriate research methods and ensure that the correct inferences are drawn.

de Lusignan S and van Weel C. The use of routinely collected computer data for research in primary care: opportunities and challenges. Family Practice 2006. 23: 253–263.

Introduction

The founding fathers of academic primary care conducted research using routinely collected practice data. William Pickles' description of infectious disease,1 Frans Huygen's ‘Families with their illness’,2 and John Fry's ‘Common morbidity’3 were produced in an era of paper data collection and provide examples of general practice research that changed the face of medicine. One can only speculate as to what more Pickles, Huygen and Fry would have achieved, had they been able to work with the large multi-practice databases that computers make possible today. The first steps towards automated processing of data in general practice took place in the 1960s and 1970s with the creation of age–sex registers. These registers consisted of an individual card for each patient stored by gender and age. Cards could be punched with a hole to signify the administration of an immunization or a diagnosis of a chronic disease, and the punched cards could then be sorted with a knitting needle. This paper reviews how data are stored in primary care computer systems, and the opportunities and challenges for using routinely collected primary care data in research. After setting the stage for primary care informatics (PCI), three sections review:

  1. How data are stored in primary care computer systems;

  2. Current use of large databases of routinely collected data and

  3. The opportunities and challenges of conducting research based on routinely collected data.

PCI

Zuboff coined the term ‘to informate’ to describe the process of using computers to automate information flows and create the capacity to derive new information.4 Its scientific study is termed informatics (Fig. 1). Health informatics is the study of data, information and knowledge and how to use this to improve health.5 Primary care is a distinct specialty,6 supported by its longitudinal medical records and decision-making processes, justifying its own health informatics subspecialty. PCI is emerging as a scientific discipline whose importance has increased in parallel with the computerisation of primary care and the consequent availability of large volumes of routinely collected data.7

Figure 1

The benefit of implementing IT systems goes beyond simple automation of information flows. To ‘informate’ is the process of deriving new information as a result of automation

The adoption of computerised medical records is accelerating worldwide.8 Many countries have ambitious plans to integrate clinical records across all health providers9 with general practice records at their core. Family practice clinical computer systems are increasingly linked to other systems for patient registration and receiving laboratory test results. Scandinavia,10,11 The Netherlands12 and the UK13 have the longest tradition and highest level of computer use in general practice, though others are catching up.14 Integrating clinical records should improve patient safety, avoid duplication of tests, and provide data to research and audit the effectiveness of care.15–17 This may be particularly important in improving the management of chronic diseases.18,19 However, there are gaps in the PCI literature, particularly on methodologies for using large datasets of routinely collected clinical data. Although such data have been used for epidemiology and analysis of management of (chronic) disease,20,21 less is known about how to utilise databases most effectively.

How data are stored in primary care computer systems

Data in general practice computer systems are stored as either narrative or structured data

Clinical computer systems record data in two ways. Firstly, they allow the recording of ‘coded’ (also termed structured) data; this is usually done by selecting from a picking list or entering data into some sort of form or template. Secondly, most clinical computer systems also allow the entry of ‘free text’ or narrative. At present ‘coded’ data are needed because there are so many ways that a clinical concept can be represented. For example: a patient with coronary heart disease can be represented by any of the following free-text labels: ‘Diagnosis of myocardial infarction,’ ‘Raised cardiac enzymes,’ ‘Myocardial ischemia,’ ‘triple vessel coronary artery disease,’ ‘three vessel coronary artery bypass grafting,’ and so on … When looking to identify people with coronary artery disease it may be important to design a search strategy that includes all the ways a concept might be represented.
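As a minimal sketch of such a search strategy, the Python below flags a patient as a possible coronary heart disease case if any record entry carries a code from a predefined code list or contains free text matching one of the labels above. The record layout, the function names and the code values (`X001`, `X002`) are hypothetical placeholders, not real Read, ICPC or ICD codes.

```python
# Hypothetical code list: placeholder values, not real clinical codes.
CHD_CODES = {"X001", "X002"}

# Free-text labels based on the examples in the text, lower-cased.
CHD_LABELS = [
    "myocardial infarction",
    "raised cardiac enzymes",
    "myocardial ischemia",
    "coronary artery disease",
    "coronary artery bypass",
]


def entry_suggests_chd(entry):
    """Return True if one record entry matches the code list or a label."""
    if entry.get("code") in CHD_CODES:
        return True
    text = entry.get("text", "").lower()
    return any(label in text for label in CHD_LABELS)


def find_chd_patients(records):
    """Return the sorted patient ids with at least one matching entry."""
    return sorted({e["patient_id"] for e in records if entry_suggests_chd(e)})


records = [
    {"patient_id": 1, "code": "X001", "text": ""},
    {"patient_id": 2, "code": None, "text": "Triple vessel coronary artery disease"},
    {"patient_id": 3, "code": None, "text": "Seasonal rhinitis"},
    {"patient_id": 4, "code": None, "text": "Three vessel coronary artery bypass grafting"},
]

print(find_chd_patients(records))
```

A plain substring match of this kind would also flag negated text such as ‘no evidence of myocardial infarction’, which illustrates why free text alone is a weak basis for case finding.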

Natural language processing has not yet developed to the point of replacing ‘coded’ clinical data

Unfortunately natural language processing (NLP), the process of searching the narrative record, has not reached the point where free-text can be automatically turned into ‘coded’ data; though progress is being made. NLP has been used to classify problem titles,22 to improve searching of bibliographic and genetic databases,23 to combine data from multiple discharge summaries,24 to automate links from medical text to the literature25 and to attempt to code data in email messages from patients.26 Although prediction is hazardous in rapidly developing fields, it is likely to be at least a decade before NLP can obviate the need for clinicians to code clinical data.

Coding systems and terminologies have got progressively larger and more sophisticated

A code is a simple representation (or label) given to something that allows it to be processed within an information system. Classifications provide a method of ordering information within a defined area or domain. The World Health Organisation's (WHO) International Classification of Disease (ICD), now in its 10th revision, is perhaps the best known classification; though many subclassifications and versions of it are in current use.27 A terminology should provide comprehensive labelling of all the concepts within a domain. For example, it will enable the inference that a person with diabetes is also a person with an endocrine disease. However, linking it to fasting blood sugar level as a diagnostic criterion is beyond the scope of a terminology.28 There is, as yet, no international consensus about the meaning of some of these labels; we have, therefore, provided definitions in Table 1. Readers who wish to explore this in more depth should compare the glossaries provided by ‘Open Clinical,’29 WONCA (World Organisation of Family Doctors),30 and the Australian Standard.31
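The kind of inference a terminology enables can be sketched as a walk up an ‘is a’ hierarchy. The concept names and the `PARENT` mapping below are illustrative assumptions, not an extract from any real terminology, and a single-parent mapping is a simplification.

```python
# Illustrative 'is a' links: each concept points to its parent concept.
PARENT = {
    "type 2 diabetes": "diabetes",
    "diabetes": "endocrine disease",
    "endocrine disease": "disease",
}


def is_a(concept, ancestor):
    """Return True if `ancestor` is `concept` itself or one of its ancestors."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        # Move one level up; PARENT.get returns None at the top of the tree.
        current = PARENT.get(current)
    return False


print(is_a("diabetes", "endocrine disease"))  # True
print(is_a("endocrine disease", "diabetes"))  # False
```

The walk only follows labels upwards; nothing in it could connect diabetes to a fasting blood sugar threshold, which matches the limit on a terminology's scope described above.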

Table 1. Definitions of Code, Classification, Terminology and Nomenclature

| Term | Definition | Examples |
|---|---|---|
| Code | A representation applied to a term so that it can be more readily processed | Read code for asthma is H33 |
| Classification | Arrangements of all elements of a domain, into groups according to established criteria | International Classification of Disease (ICD); International Classification of Primary Care (ICPC) |
| Terminology | Language labels attached to a concept (all terms of a professional domain) | Read Clinical Terms Version 3, CT v3 |

There is no single standard system for recording structured data

There is no international standard approach to coding and classification.32–35 Many countries use ICPC (International Classification of Primary Care) developed by the WONCA international classification committee.36 It is included in the WHO family of classifications and has been translated into 20 languages and modified to meet different countries' needs. Its alphanumeric coding is ordered anatomically and allows coding of reason for encounter, diagnostic and therapeutic procedures and diagnoses, directed at health problems with a prevalence of 1.0 or more per 1000 patients. For this purpose, 400 codes suffice.

UK primary care mainly uses Read at present and is due to migrate to SNOMED CT (Systematised Nomenclature for Medicine—Clinical Terms37). An overview of the history of these systems is set out in Boxes 1 and 2.

Box 1
History of ICPC (International Classification of Primary Care) and associated codes

1976—ICHPPC—International Classification of Health Problems in Primary Care

This was a list of common disorders met in primary care96 that was derived from ICD-8 (Eighth revision of the World Health Organisation's International Classification of Diseases97). It was the forerunner of ICPC, released in 1987 (see below).

1984—RFEC—Reason for encounter classification98

1985—IC-process-PC-International classification of health process in Primary care99

These classification systems reflect the need to code more than just the disease or problem title. These two codes were brought together with ICHPPC to form ICPC. It has been represented formulaically as: (ICHPPC + RFEC + IC-process-PC = ICPC)

1987—ICPC–International Classification of Primary Care

Since 1972 the WONCA (World Organisation of National Colleges and Academies of Family Practice100) has been looking to develop an instrument to support research in general practice. In 1987 they released the first version of ICPC (International Classification for Primary Care.)

1994—ICD 10—International Classification of Diseases and related health problems

ICPC is used alongside ICD-10 (the World Health Organisation's International Classification of Diseases). These classifications, ICPC and ICD-10, are distributed at very low cost, removing the financial barriers associated with the use of Read Codes or Clinical Terms. However, ICPC and ICD-10 have limitations in that they are less comprehensive and therefore can represent fewer concepts accurately. Although satisfactory at recording diagnosis, they are much less good at recording detailed observations of actions taken as a result. Modifications of ICD-10 to overcome these limitations include ICD-10 CM (International Classification of Diseases, 10th Revision, Clinical Modification) and ICD-10 PCS (International Classification of Diseases, 10th Revision, Procedure Coding System).

1997—ICPC2—ICPC Version 2

In 1997 WONCA published ICPC-2. Like the first version of ICPC it is a bi-axial coding system. There are 17 body system related chapters and seven components covering patient-orientated aspects of primary care: diagnosis, reason for encounter, etc. ICPC-2 has been released in over 20 languages. It was extended to ICPC-Plus in 1998 in Australia; additional terms allow more detailed meaning to be provided101 than with the standard ICPC-2 release.

2003—ICPC-2,2003

Latest update of ICPC is designated ICPC-2, 2003. Accepted by the WHO into the Family of International Classifications (FIC) as a reason for encounter classification and as a classification for primary care or general practice whenever applicable. It also has comprehensive mapping to ICD-10.

2005—ICPC-3 planned

To overcome problems mapping ICPC to ICD-10, and to develop a more internationally usable classification WONCA is to develop ICPC-3 which will map to ICD-11.

Box 2
History of Read codes and SNOMED (Systematized Nomenclature for Medicine)

1965—SNOP (Systemized Nomenclature of Pathology)

Introduced by the College of American Pathologists (CAP), this was the forerunner of SNOMED RT, see below.

1997—SNOMED (Systemized Nomenclature of Medicine)

Extended version of SNOP from CAP.

1983—Read code version 1

In 1983 James Read's codes were released, and these have gone on to become the UK national standard. The initial coding systems were about compactness, as early computer systems had so little memory. Later on it became more important that they were comprehensive. These early versions of Read codes (Versions 1 and 2) were hierarchical, like a family tree, with ‘parent’ and ‘child’ codes.

1988—Read codes version 2, 4- and 5-byte sets

This code set was recommended for use by the Joint Computing Committee of the British Medical Association and Royal College of General Practitioners in the UK. It has ∼30 000 terms. It has been superseded in most general practice computer systems by the 5-byte code set that offers ∼100 000 terms102.

1993—SNOMED-RT (Reference Terminology)

CAP release the most extensive version thus far.

1994—Read version 3, Clinical Terms (CT v3)

In 1994 a concept based coding system was developed (Read 3)103, also known as ‘Clinical Terms.’ The intention was to develop a terminology that could include specialist practice as well as general practice. It has over 200 000 terms. This will not be developed independently in the UK, but instead the Read codes have been merged with an American coding system called SNOMED (Systematized Nomenclature for Medicine). The new combined version is to be known as SNOMED CT (CT for clinical terms.)

1999—SNOMED CT—Systematized Nomenclature for Medicine—Clinical Terms

(Read CT v3) + (SNOMED RT) = (SNOMED CT)

In 1999 the UK Health Minister and the College of American Pathologists announced a joint venture to develop SNOMED CT by late 2001. This was to be a combination of the SNOMED RT (Reference Terminology)104 developed in the United States and Clinical Terms version 3 developed in the UK105. The NHS is due to migrate to SNOMED CT37.

The most extensive terminology to date; it has a semantic net of over 300 000 medical concepts; there are multiple axes and hierarchies and over 7 million relationships.

Other regions and countries have developed their systems to meet their needs. For example, Australia has developed its own modification of ICD, ICD-10-AM (Australian Modification38), and extended ICPC-2 to produce ICPC-Plus, a nomenclature that included chronic conditions routinely managed in Australian primary care.39,40 In 2005, development of ICPC-3 got underway, seeking to cross-link to ICD-10, the ICF functional status, ATC drugs classification and others. It will also fundamentally revise the chapter on social problems. This way, ICPC will become the basic structure of health information.

The main difference between the different coding systems is their level of granularity. ICPC is the smallest (offering fewest coding alternatives), ICD-10 comes somewhere in the middle, and Read Clinical Terms (CT v3) provides a complex system. The impact of using different coding systems is illustrated in Box 3, where a Dutch GP using ICPC compares their choice of codes with that of an Icelandic GP using ICD-10 and an English GP using CT v3.

Box 3
Comparing the coding of pneumonia in general practice using ICPC, ICD-10 and CTv3

(1) Coding pneumonia with ICPC (International Classification of Primary Care)

A Dutch GP using this system is presented with a choice of only two options

(i) Bronchopneumonia OR (ii) Other pneumonia.

If this GP wishes to code in more detail they must use ICD.

(2) Coding pneumonia with ICD-10 (International Classification of Diseases and Health Related Problems)

An Icelandic GP will code problems using this system. They are presented with a choice of 80 codes. These relate to different causes of pneumonia including congenital cause, infections that may result in pneumonia and the type.

(3) Coding pneumonia with CT v3 (Read Clinical Terms version 3)

A UK GP using this system will be presented with a choice of 182 alternatives. These alternatives have far finer granularity than those offered by ICD, e.g. Right middle zone pneumonia. Many of these have lower levels with further codes available.

Using large databases of routinely collected primary care data

Large GP datasets used in UK primary care research

Aggregated practice data presented in research have usually been collected with a single brand of computer system. Contributing practices may have had special training or feedback, and only those who have data above a certain quality level may be included.41 Reliance on a subset of practices can over-state data quality when generalising to routine practice.42,43 Relatively little research has been done in comparisons between different computer systems and their advantages and disadvantages.44,45 The principal networks used for research in the UK are shown in Table 2.

Table 2. Large single computer system databases currently in use in the UK

| Large database | Clinical system | Previous names | URL (large database) | URL (clinical system) |
|---|---|---|---|---|
| General Practice Research Database (GPRD) | In-practice Systems (IPS) | VAMP, Reuters Vision | www.gprd.com | www.inps.co.uk |
| Mediplus database | iSoft-Torex | Torex Meditel | www.ims-global.com/index.html | www.isoftplc.co.uk |
| The Doctors Independent Network (DIN) | Torex Meditel | | www.dinweb.org/dinweb | www.isoftplc.co.uk |
| Q-research | Egton Medical Information Systems (EMIS) | None | www.qresearch.org | www.emis-online.com |

Another approach to the acquisition of general practice data is to use data extraction tools which extract the data required for a specific study. Generally, studies using this approach extract data from a range of different computer systems.20 Hybrid solutions are also possible with a wider sample of data of general practices being extracted to answer a specific research question.20,46

Examples of large GP databases worldwide

Internationally, there are a large number of practice networks and databases available. They vary greatly in methodology, size and type of data collected. At the smaller end of the scale the Nijmegen academic family practice research network, founded in 1971, comprises four practices (11 practitioners) collecting a wide range of longitudinal data.47–50 At the larger end, data from the Department of Veterans Affairs are collected from American military establishments and from medical facilities which support ex-servicemen around the globe.51,52

Some GP data are still collected using paper forms. One of the best established is the National Ambulatory Medical Care Survey (NAMCS) in the USA, which collects data from sample practices about their case-mix and workload every year.53 BEACH (Bettering the Evaluation And Care of Health) in Australia also uses paper to collect continuous data from 20 practices.54 Paper is also used in New Zealand, though the data are then transmitted centrally electronically.55 Examples of computerised databases are summarised in Table 3.

Table 3. General Practice research databases

| Country | Database | Organisation | Coding system | URL |
|---|---|---|---|---|
| The Netherlands | (1) Amsterdam Transition Project | University of Amsterdam | ICPC, ICPC-2/ICD-10 thesaurus | http://www.onderzoekinformatie.nl/en/oi/nod/onderzoek/OND1284701/ |
| The Netherlands | (2) Maastricht database | Department of GP Maastricht University | ICPC | http://www.ulb.ac.be/esp/emd/nl_metsemakers.htm |
| The Netherlands | (3) Nijmegen academic family practice research network | University of Nijmegen | ICPC | http://www.ebp-umcn.org/home.htm |
| Finland | Stakes statistical databases | National Research and Development Centre for Welfare and Health (STAKES, Finland) | ICD-9, now ICD-10 | http://info.stakes.fi/nettihilmo/english/default.htm |
| Denmark | Continuous morbidity registration and quality development combined by using the Extended Danish International Classification of Primary Care | University of Odense, Dep. of General practice | Extended Danish ICPC (ICPC-E). A small part of ICD-10 relevant to primary care is included in ICPC | www.ulb.ac.be/esp/emd/dk_falkoe.htm |
| Norway | Data retrieval in general practice in Norway | Department of Community Health and General Practice University of Trondheim | All problems labelled using ICPC | www.ulb.ac.be/esp/emd/no_grimsmo.htm |
| France | The Sentinel Surveillance of Referral to Hospital in French Primary Care | WHO Collaborating Centre for Electronic Disease Surveillance, Paris, France | ICPC | www.ulb.ac.be/esp/emd/fr_letrilliard.htm; http://rhone.b3e.jussieu.fr/senti/en/docs/plaquette.pdf (in French) |
| Italy | (1) Health Search Database (HSD) | Italian General Medicine Society (SIMG) + Italian College of General Practitioners | | http://www.healthsearch.it/report2002_pdf/Divisi5/Report2002_4.pdf (Abstracts in English); http://www.healthsearch.it/ (In Italian) |
| Italy | (2) CSeRMEG: General Practice Research Group | Centro Studi e Ricerche in Medicina Generale | | http://www.csermeg.it/eng/cs-idx.htm |
| New Zealand | Dunedin Research Unit | Royal New Zealand College of General Practitioners' (RNZCGP) Dunedin Research Unit | | http://www.nzma.org.nz/journal/115-1163/200/ |
| USA | (1) Department of Veterans Affairs | Department of Veterans Affairs | SNOMED-CT | http://www.hhs.gov/healthit/attachment_2/iii.html |
| USA | (2) Practice Partner Network (PPRNET) | Medical University of South Carolina (MUSC) + Physician Micro Systems, Inc. (PMSI) | ICD-CM (Clinical modification) | http://www.musc.edu/PPRNet/index.htm |

The opportunities and challenges of routinely collected primary care data

These are summarised in Fig. 2.

Figure 2

The opportunities and challenges of using routinely collected primary care data in research

Opportunities

Growing volumes of data are routinely recorded

More and more data are routinely collected using fewer coding systems. Telematics and patient-held records add to the increasingly accessible volume of data.56

Some data, notably repeat prescribing, blood pressure and major morbidities, are complete and accurate, but other data are less so.57 On their own, such data make little sense. Data items need to be related to individual patients' characteristics (sex, age, social class, use of health care facilities, medical family history, life events, etc.): the ‘thick and rich’ of primary care research.58

Improving data quality

Feedback,44 incentives59 and evidence-based guidelines all contribute to improvements in data quality. Once quality procedures are in place, it is possible to achieve high data quality standards.60,61 A complete and accurate dataset is important in clinical research because missing data are hard to interpret. As practices ‘informate’, procedures become routine and data quality improves; for example, computer data are used to screen for depression,62 recall patients63 and provide reminders to physicians.64

Technical progress

Technical barriers to processing data are diminishing. Hardware can readily process larger volumes of data. Disk capacity is growing faster than the size of text-based records. Communications technology allows integration of clinical records and data derived from home monitoring.

The importance of unique patient identifiers is increasingly recognised. Unique identifiers are critical if data relating to the same individual are to be linked. The ‘community health index number’ used in Scotland is a good example of a simple, effective solution.65 The NHS number66 could fulfil the same role in England and Wales.
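A minimal sketch of why a shared identifier makes linkage straightforward is given below. The record layout, the field name `patient_id` and the identifier values are hypothetical and are not drawn from any of the systems cited above; the sketch only assumes that both sources carry the same identifier for the same individual.

```python
from collections import defaultdict


def link_by_identifier(primary_care, other_source, key="patient_id"):
    """Group records from two sources under the identifier they share.

    Only individuals present in BOTH sources are returned.
    """
    linked = defaultdict(lambda: {"primary_care": [], "other": []})
    for record in primary_care:
        linked[record[key]]["primary_care"].append(record)
    for record in other_source:
        linked[record[key]]["other"].append(record)
    return {
        pid: sources
        for pid, sources in linked.items()
        if sources["primary_care"] and sources["other"]
    }


# Hypothetical records; the identifiers are invented for illustration.
practice_records = [
    {"patient_id": "ID-0001", "diagnosis": "hypertension"},
    {"patient_id": "ID-0002", "diagnosis": "asthma"},
]
hospital_records = [
    {"patient_id": "ID-0001", "admission": "2005-03-14"},
    {"patient_id": "ID-0003", "admission": "2005-04-02"},
]

linked = link_by_identifier(practice_records, hospital_records)
print(sorted(linked))
```

Without a reliable shared identifier, the lookup on `key` has no dependable value to match on, and linkage must fall back on less certain matching of names, dates of birth and addresses.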

Proprietary and open source tools are making data extraction easier.67 Interoperability between systems is enabled by standards for exchange, management and sharing of information. For example, Health Level 7 (HL7) provides a framework for achieving this at the application level—e.g. allowing physicians using a clinical system to request an X-ray appointment on the radiology computer system.68

Opportunities exist to link GP data with genetic data

A great deal of post-genomic research is generic rather than ‘disease specific’, which makes generic primary care databases attractive for genetic research. Exciting opportunities exist to link primary care data with genetic data. As our knowledge about genetics grows, computerised data will allow exploration of the genetic–environmental balance in the causation of a wide range of diseases and may eventually enable pharmacotherapy to be personalised.69,70 Iceland has developed a genetic database linked to clinical computer systems.71 UK Biobank will also link genetic and primary care data,72 emphasising the relevance of primary care databases in this research enterprise.

International PCI community

There is an international network of academics and practitioners with a special interest in PCI. Health informatics also has its own journals and conferences. Groups can be found in general practice and family practice organisations as well as in the international informatics associations. The major organisations and their associated journals and conferences are shown in Table 4.

Table 4

International PCI groups

Initials | Full name | URL + Primary Care working group (PCI WG) | Associated Medical Informatics Journal + URL | Associated Conference
EFMI | European Federation for Medical Informatics | www.efmi.org; PCI WG: follow WG links | International Journal of Medical Informatics* www.elsevier.com/locate/ijmedinf; Health Informatics Europe* (on-line) www.hi-europe.info/ | MIE (Medical Informatics Europe), yearly except MEDINFO years
PHCSG BCS | Primary Healthcare Specialist Group of the British Computer Society | www.phcsg.org.uk; contains a number of special interest groups | Informatics in Primary Care* www.radcliffe-oxford.com/journals/J12_Informatics_in_Primary_Care/default.htm; journal archive, 1995–2001: http://www.primis.nhs.uk/informatics/ | Annual Conference (June) and Scientific (September)
AMIA | American Medical Informatics Association | www.amia.org; PCI WG: follow WG links | JAMIA (Journal of the American Medical Informatics Association) www.jamia.org | Annual symposium (November), except MEDINFO years
IMIA | International Medical Informatics Association | www.imia.org; PCI WG: follow WG links | Methods of Information in Medicine* www.schattauer.de/zs/methods/main.asp | MEDINFO, every three years. Next: Australia 2007
SIM | The Society for the Internet in Medicine | www.internet-in-medicine.org/ | JMIR (Journal of Medical Internet Research)* www.jmir.org; Medical Informatics and the Internet in Medicine* www.tandf.co.uk/journals/titles/14639238.htm | MEDNET conference annually in the autumn

Journals marked with an asterisk may be an official journal of one or more informatics associations.

Challenges

Identify appropriate research methods for analysing data from primary care databases

The research methods employed by researchers using primary care databases need to be described in much greater detail. Descriptions are often too limited to allow other researchers to replicate the studies performed. Comparisons between databases, and studies of the validity of data from well-equipped practices,73 have been made. However, it is often unclear how decisions were made to include or exclude practices, patients or individual data items from studies.

How to infer meaning

Research often requires an accurate denominator, which is most easily achieved when health systems provide primary care listing of patients.74 Nevertheless, problems with health systems' administration are likely to result in inaccurate reporting of morbidity.75,76 Without a reliable denominator it is more difficult to calculate or compare incidence or prevalence.77 Surrogate markers (e.g. using prescriptions of thyroxine as a marker for the diagnosis of myxoedema) can be used to infer the incidence and prevalence of disease.78
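The arithmetic behind this can be sketched as follows. This is an illustration under stated assumptions, not a method taken from the cited studies: the prescription records, the patient identifiers and the list size of 2000 are invented, and the function names are hypothetical. The numerator counts distinct patients with at least one prescription for the marker drug; the denominator is the registered list.

```python
def patients_with_marker(prescriptions, marker_drug):
    """Return the distinct patients with at least one prescription for marker_drug."""
    target = marker_drug.lower()
    return {rx["patient_id"] for rx in prescriptions if rx["drug"].lower() == target}


def prevalence_per_1000(cases, registered_population):
    """Prevalence per 1000 registered patients; needs a reliable denominator."""
    if registered_population <= 0:
        raise ValueError("a positive denominator is required")
    return 1000 * cases / registered_population


# Hypothetical prescribing data: repeat prescriptions must not be double counted.
prescriptions = [
    {"patient_id": "P1", "drug": "Thyroxine"},
    {"patient_id": "P1", "drug": "thyroxine"},
    {"patient_id": "P2", "drug": "thyroxine"},
    {"patient_id": "P3", "drug": "amoxicillin"},
    {"patient_id": "P4", "drug": "thyroxine"},
]
registered_list_size = 2000  # assumed size of the practice's registered list

treated = patients_with_marker(prescriptions, "thyroxine")
estimate = prevalence_per_1000(len(treated), registered_list_size)
print(f"{len(treated)} treated patients, {estimate:.1f} per 1000 registered")
```

The estimate is only as good as both inputs: an inflated or out-of-date list distorts the denominator, and a surrogate marker misses untreated patients and includes anyone prescribed the drug for another indication.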

The architecture of the computerised record is also important; data linkage and problem orientation can facilitate research. Data linkage, where treatments and interventions are linked to the relevant problem, can help with data interpretation. Unfortunately, it is not a feature of all computer systems. In problem-orientated medical records, data are readily grouped by ‘problem’ rather than being a simple chronological record. Computers readily lend themselves to problem-orientated displays, so long as the data are appropriately coded. For example, in many systems it is possible to display all the consultations for hypertension and the drugs prescribed for that problem.79
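The difference between a chronological record and a problem-orientated view can be sketched in a few lines. The entry structure below (fields `date`, `kind`, `detail` and `problem`) is a hypothetical simplification and does not represent the data model of any particular clinical system; entries with no recorded link to a problem are collected under `unlinked`.

```python
from collections import defaultdict


def group_by_problem(entries):
    """Regroup a chronological record into a problem-orientated view.

    Entries stay in date order within each problem. Dates are ISO strings,
    so sorting them as text gives chronological order.
    """
    by_problem = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e["date"]):
        by_problem[entry.get("problem", "unlinked")].append(entry)
    return dict(by_problem)


# Hypothetical record for one patient.
record = [
    {"date": "2005-02-10", "kind": "prescription", "detail": "antihypertensive", "problem": "hypertension"},
    {"date": "2005-01-12", "kind": "consultation", "detail": "BP check", "problem": "hypertension"},
    {"date": "2005-01-20", "kind": "consultation", "detail": "wheeze", "problem": "asthma"},
    {"date": "2005-03-01", "kind": "prescription", "detail": "analgesic"},
]

view = group_by_problem(record)
for problem, items in view.items():
    print(problem, [(item["date"], item["kind"]) for item in items])
```

The last entry shows the limitation noted above: where a system does not link a prescription to a problem, the reason for the treatment cannot be recovered from the structure of the record alone.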

End points can be difficult to gather from primary care data, and linking ICPC to functional status classification might prove a vital development here. Capturing data about death, or the reasons why patients stop attending a physician, can be difficult, though the French Sentinel network appears to have overcome many of these problems in capturing reliable suicide data,80 and the same has been achieved in The Netherlands for diabetes and heart disease.81

A recent systematic review concluded that the lack of a gold standard for completeness and accuracy of data hampers assessment of quality.82 There is often marked inter-practice variation in data quality.83,84,85 For example, the Health Search Database (HSD) in Italy has shown that for some diseases patient-reported prevalence is the same as that in the primary care record, whereas for others it is lower;86 the French Sentinel system, however, found good correlation between hospital and practice diagnoses.85

Payments to GPs can distort coding practice. When GPs receive payments for specific diagnoses (pneumonia but not upper respiratory tract infection), interventions (prescribing, but not advice or wait-and-see) or performances (home visits, but not nurse-led clinics), it is likely that GP records will report antibiotic-treated pneumonia seen on a home visit rather than a common cold for which the nurse advised wait-and-see. Medical certification for sickness absence can also skew data recording.87

Pace of change

The capacity and capability of computers to process data and the rate of change in health services appear to be accelerating. Particular problems can arise when changes occur.88,89 Data loss can be a problem when changing computer systems or coding systems. The Nijmegen family practice research network has overcome the risk of data loss by continuing the same e-book system for more than 30 years.90,91 This is a luxury not afforded to other data collecting networks, which must change in line with reforms in their health system. This may become more of a problem as the pace of change accelerates.
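One way to limit data loss when a coding system changes is to keep the original code alongside the translated one and to report anything that cannot be mapped. The sketch below illustrates that idea only; it is not a description of how any of the networks cited above handled a migration, and the codes `OLD-001`, `NEW-A01` and so on are invented placeholders rather than entries from a real classification.

```python
def migrate_codes(records, code_map):
    """Translate codes with a mapping table while preserving the originals.

    Returns (migrated_records, unmapped_codes). A record whose code has no
    mapping keeps its original code in 'original_code' and gets code=None,
    so nothing is silently discarded.
    """
    migrated = []
    unmapped = set()
    for record in records:
        old_code = record["code"]
        new_code = code_map.get(old_code)
        if new_code is None:
            unmapped.add(old_code)
        migrated.append({**record, "original_code": old_code, "code": new_code})
    return migrated, sorted(unmapped)


# Hypothetical mapping table and historic records.
code_map = {"OLD-001": "NEW-A01", "OLD-002": "NEW-B07"}
historic = [
    {"patient_id": "P1", "code": "OLD-001"},
    {"patient_id": "P2", "code": "OLD-999"},
]

migrated, unmapped = migrate_codes(historic, code_map)
print(migrated)
print("codes needing manual review:", unmapped)
```

Retaining `original_code` means that a later, better mapping can be applied to the same historic data, which matters for longitudinal records that span more than one coding system.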

Integrating systems

Problems with the lack of a reliable unique identifier for patients make linkage with other systems challenging. Where links have been made between morbidity data and socio-economic data these comparisons have proved useful.92

Ethical issues: data ownership, security, confidentiality and privacy

Ownership of data is a complex issue. GPs collect data that are often shared with third parties (departments of health, insurance companies), and the extent to which third parties should be able to access these data for their own research remains unclear. Clarification is needed about when identifiable or anonymised data can be used in public health. When and how consent should be obtained, and the degree of anonymisation required, also need better definition.93

However, it appears that patients may be less concerned about the risks of secondary use of their data than physicians are on their behalf. Only a minority of patients appear to have clear opinions about with whom they wish their records to be shared.94 Most (79%) patients are happy to see their data used for research by not-for-profit organisations.95
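As an illustration of one point on the spectrum between identifiable and anonymised data, the sketch below replaces a patient identifier with a keyed hash before records leave the practice. This is pseudonymisation, offered as an assumption about one possible approach and not as the method of any system cited above; the key, the identifier and the field names are hypothetical. It uses the HMAC-SHA256 functions from the Python standard library.

```python
import hashlib
import hmac


def pseudonymise(patient_id, secret_key):
    """Return a stable pseudonym for patient_id using a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, so records
    for one individual can still be linked without exposing the identifier.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_identifiers(record, secret_key, id_field="patient_id", drop=("name", "address")):
    """Copy a record, replacing the identifier and dropping directly identifying fields."""
    cleaned = {k: v for k, v in record.items() if k not in drop and k != id_field}
    cleaned["pseudonym"] = pseudonymise(record[id_field], secret_key)
    return cleaned


# Hypothetical key and record; a real key would be generated and stored securely.
key = b"example-key-held-by-the-data-custodian"
record = {"patient_id": "ID-0001", "name": "A Patient", "address": "1 High St", "diagnosis": "asthma"}

print(strip_identifiers(record, key))
```

Pseudonymised data are not anonymous: whoever holds the key can regenerate the pseudonyms, and the remaining clinical detail may still identify an individual in a small population, which is why the degree of anonymisation required needs explicit definition.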

Discussion

Computerisation provides cheap and relatively easy access to large volumes of data, and data quality is improving all the time. Informatics opens possibilities for research on a scale unprecedented in the history of primary care. This is an opportunity for innovation which should be capitalised on.

The challenges faced by researchers working with primary care data are the lack of tools to explore a narrative record, and inferring meaning when working with incomplete data. When studying the natural history of a disease, it is essential that all patients studied actually have the disease: we have to consider not only whether the GP made a correct diagnosis but also whether the data were recorded correctly in the computer. When using computer data to study the use of health care resources and quality of care, the computer is likely to have an accurate record of the biomedicine that was practised, whether or not this was relevant to the patient's underlying health problem. As coding systems change we will need to work out how to incorporate historic data into longitudinal records. Ethical issues about data ownership and permissions remain problematic.

It is inevitable that routinely collected data will be used more and more in research. For research into health resource use and quality of care, ‘simple’ service data may often suffice. However, for clinical research more in-depth supervised data are required. This needs investment in (research) practices and will only be possible in a selected number of practices, with links to other databases (death certificates and secondary care) further improving the quality of the data. Research in primary care requires large volumes of data, in particular when assessing the odds of serious, rare diseases and complications behind everyday symptoms and signs. Given the number of general practice databases available, an innovative scientific development might be the linking of international databases and research networks.

Conclusions

Technological developments, the commitment of health services to implement them, and improvements in clinicians' skills and willingness to use IT have all contributed toward raising data quality to the point where routinely collected clinical data can be used for research. Routinely collected primary care computer data, aggregated into large databases, can be used for audit, quality improvement (especially in chronic disease management), health service planning, epidemiological study and research. As our understanding of genetics grows, the data can also be used to study the environmental–genetic balance in disease.

However, there are gaps in the literature about how to find relevant data, select appropriate research methods for their analysis, and ensure that the correct inferences are drawn. This paper sets out to fill some of these gaps and develop the health informatics evidence base.

The series editors, Professors Frank Sullivan and Azeem Majeed, suggested the idea for the paper and have made helpful suggestions throughout its development.

References

1. Pickles W. Epidemiology in a country practice. (First published 1939). London: RCGP; 1994.

2. Huygen FJA. Family Medicine. London: RCGP; 1983.

3. Fry J. General Practice: The Facts. Oxford: Radcliffe Medical Press; 1992.

4. Zuboff S. In the Age of the Smart Machine: The Future of Work and Power. New York: Basic Books; 1988.

5. Sullivan F. What is health informatics? J Health Serv Res Policy 2001; 6: 251–254.

6. Allen J, Gay B, Crebolder H et al. The European definitions of the key features of the discipline of general practice: the role of the GP and core competencies. Br J Gen Pract 2002; 52: 526–527.

7. de Lusignan S. What is primary care informatics? J Am Med Inform Assoc 2003; 10: 304–309.

8. de Lusignan S, Teasdale S, Little D et al. Comprehensive computerised primary care records are an essential component of any national health information strategy: report from an international consensus conference. Inform Prim Care 2004; 12: 255–264.

9. Ash JS, Bates DW. Factors and forces affecting EHR system adoption: report of a 2004 ACMI discussion. J Am Med Inform Assoc 2005; 12: 8–12.

10. Hasvold T. A computerized medical record. ‘The Balsfjord system’. Scand J Prim Health Care 1984; 2: 125–128.

11. Krogh-Jensen P. Electronic records for general practice. The Danish system. Scand J Prim Health Care 1984; 2: 121–123.

12. Knottnerus JA. Role of the electronic patient record in the development of general practice in The Netherlands. Methods Inf Med 1999; 38: 350–354.

13. Benson T. Why British GPs use computers and hospital doctors do not. Proc AMIA Symp 2001: 42–46.

14. Taylor H, Leitman R. European physicians especially in Sweden, Netherlands and Denmark, lead U.S. in use of electronic medical records. Harris Interactive: Healthcare News 2002; 2: 1–3. Available at: http://www.harrisinteractive.com/news/newsletters/healthnews/HI_HealthCareNews2002vol2_Iss16.pdf.

15. Berner ES, Detmer DE, Simborg D. Will the wave finally break? A brief view of the adoption of electronic medical records in the United States. J Am Med Inform Assoc 2005; 12: 3–7.

16. Tomlin A, Hall J. Linking primary and secondary healthcare databases in New Zealand. N Z Med J 2004; 117: U816.

17. Palomo L, Gervas J, Garcia-Olmos L. The frequency of illnesses attended and its relationship with the maintenance of the family doctor's skill [Article in Spanish]. Aten Primaria 1999; 23: 363–370.

18. Mitchell E, Sullivan F, Grimshaw JM, Donnan PT, Watt G. Improving management of hypertension in general practice: a randomized controlled trial of feedback derived from electronic patient data. Br J Gen Pract 2005; 55: 94–101.

19. de Lusignan S, Hague N, Brown A, Majeed A. An educational intervention to improve data recording in the management of ischaemic heart disease in primary care. J Public Health (Oxf) 2004; 26: 34–37.

20. Pringle M, Hobbs R. Large computer databases in general practice. BMJ 1991; 302: 741–742.

21. Black NA. Developing high-quality clinical databases. The key to a new research paradigm. BMJ 1997; 315: 831–832.

22. Chapman WW, Christensen LM, Wagner MM et al. Classifying free-text triage chief complaints into syndromic categories with natural language processing. Artif Intell Med 2005; 33: 31–40.

23. Chen L, Friedman C. Extracting phenotypic information from the literature via natural language processing. Medinfo 2004; 2004: 758–762.

24. Liu H, Friedman C. CliniViewer: a tool for viewing electronic medical records based on natural language processing and XML. Medinfo 2004; 2004: 639–643.

25. Janetzki V, Allen M, Cimino JJ. Using natural language processing to link from medical text to on-line information resources. Medinfo 2004; 2004: 1665.

26. Hsieh Y, Hardardottir GA, Brennan PF. Linguistic analysis: terms and phrases used by patients in e-mail messages to nurses. Medinfo 2004; 2004: 511–515.

27. University of North Carolina. International Classification of Disease. Available at: http://www.cpc.unc.edu/dataarch/nhealth/icd.html.

28. Rector AL. Clinical terminology: why is it so hard? Methods Inf Med 1999; 38: 239–252.

29. Open Clinical Glossary. Available at: http://www.openclinical.org/glossary.html.

30. Bentzen N. (ed). WONCA Dictionary of General/Family Practice. Copenhagen: Maanedsskrift for Praktisk Laegegerning; 2003.

31. Australian Standard. The language of health concept representation. Sydney: Standards Australia; 2005.

32. Wilson R, Purves I. Coding and Nomenclatures: A Snapshot from around the world. Available at: http://www.cs.man.ac.uk/mig/links/RCSEd/coding-use-snapshot.htm.

33. Lagasse R, Desmet M, Jamoulle M et al. European situation of the routine medical data collection and their utilisation for health monitoring: Euro-med-data Final Report; 2001. Available at: http://www.ulb.ac.be/esp/emd/Emd_rep.pdf.

34. Deckers J, Schellevis F. Health Information from Primary Care. Final report. Utrecht: Netherlands Institute for Health Services Research (NIVEL). Available at: http://www.nevel.nl.

35. de Lusignan S, Minmagh C, Kennedy J, Zeimet M, Bommezijn H, Bryant J. A survey to identify the clinical coding and classification systems currently in use across Europe. Medinfo 2001; 10: 86–89.

36. ICPC-2. International classification of primary care: an introduction. Available at: http://www.ulb.ac.be/esp/wicc/icpc2.html.

37. NHS Information Authority. SNOMED CT (Systematized Nomenclature of Medicine – Clinical Terms). Available at: www.nhsia.nhs.uk/snomed/.

38. National Centre for Classification in Health (NCCH). Australian Modification of the World Health Organization ICD-10-AM. Available at: http://www3.fhs.usyd.edu.au/ncch/.

39. Britt H. A new coding tool for computerised clinical systems in primary care – ICPC plus. Aust Fam Physician 1997; 26 (Suppl. 2): s79–s82.

40. O'Halloran J, Miller GC, Britt H. Defining chronic conditions for primary care with ICPC-2. Fam Pract 2004; 21: 381–386.

41. de Lusignan S, Stephens PN, Adal N, Majeed A. Does feedback improve the quality of computerized medical records in primary care? J Am Med Inform Assoc 2002; 9: 395–401.

42. Scobie S, Basnett I, McCartney P. Can general practice data be used for needs assessment and health care planning in an inner-London district? J Public Health Med 1995; 17: 475–483.

43. de Lusignan S, Dzregah B, Hague N, Chan T. Cholesterol management in patients with ischaemic heart disease: an audit-based appraisal of progress towards clinical targets in primary care. Br J Cardiol 2003; 10: 223–228.

44. Carey IM, Cook DG, De Wilde S et al. Implications of the problem orientated medical record (POMR) for research using electronic GP databases: a comparison of the Doctors Independent Network Database (DIN) and the General Practice Research Database (GPRD). BMC Fam Pract 2003; 4: 14.

45. Majeed A. Sources, uses, strengths and limitations of data collected in primary care in England. Health Statistics Quarterly 2004; 21: 23–29. Available at: http://www.statistics.gov.uk/downloads/theme_health/HSQ21.pdf.

46. Horsfield P, Teasdale S. Generating information from electronic patient records in general practice: a description of clinical care and gender inequalities in coronary heart disease using data from over two million patient records. Inform Prim Care 2003; 11: 137–144.

47. van Weel C. Longitudinal research and data collection in primary care. Ann Fam Med 2005; 3 (Suppl. 1): s46–s51.

48. van Weel C, Smith H, Beasley JW. Family practice research networks. Experience from three countries. J Fam Pract 2000; 49: 938–943.

49. de Grauw WJ, van de Lisdonk EH, van den Hoogen HJM, van Weel C. Cardiovascular morbidity and mortality of type 2 diabetes patients. Diabet Med 1995; 12: 117–122.

50. van Weel-Baumgarten EM, van den Bosch WJHM, van den Hoogen HJM, Zitman FG. Ten year follow-up of depression after diagnosis in general practice. Br J Gen Pract 1998; 48: 1643–1646.

51. Brown SH, Lincoln MJ, Groen PJ, Kolodner RM. VistA – U.S. Department of Veterans Affairs national-scale HIS. Int J Med Inform 2003; 69: 135–156.

52. Penz JF, Brown SH, Carter JS et al. Evaluation of SNOMED-CT coverage of veterans health administration terms. Medinfo 2004; 2004: 540–544.

53. National Centre for Health Statistics (NCHS). National Ambulatory Medical Care Survey (NAMCS). Available at: http://www.cdc.gov/nchs/about/major/ahcd/namcsdes.htm.

54. Australian General Practice Statistics and Classification Centre (AGPSCC). The BEACH (Bettering the Evaluation And Care of Health) Project. Available at: http://www.fmrc.org.au/beach.htm.

55. New Zealand Ministry of Health. The National Primary Medical Care Survey (NatMedCa). Available at: http://www.moh.govt.nz/natmedca.

56. Frist WH. Healthcare in the 21st Century. N Engl J Med 2005; 352: 267–272.

57. Whitelaw FG, Nevin SL, Milne RM, Taylor RJ, Taylor MW, Watt AH. Completeness and accuracy of morbidity and repeat prescribing records held on general practice computers in Scotland. Br J Gen Pract 1996; 46: 181–186.

58. Herbert CP. Future of research in family medicine: where to from here? Ann Fam Med 2004; 2 (Suppl. 2): s60–s64.

59. Roland M. Linking physicians' pay to the quality of care – a major experiment in the United Kingdom. N Engl J Med 2004; 351: 1448–1454.

60. van Weel C. Validating long term morbidity recording. J Epidemiol Commun Health 1995; 49 (Suppl. 1): 29–32.

61. van Weel-Baumgarten EM, van den Bosch WJHM, van den Hoogen HJM, Zitman FG. The validity of the diagnosis of depression in general practice: is using criteria for diagnosis as a routine the answer? Br J Gen Pract 2000; 50: 284–287.

62. Gill JM, Dansky BS. Use of an electronic medical record to facilitate screening for depression in primary care. Prim Care Companion J Clin Psychiatry 2003; 5: 125–128.

63. Gill JM, Ewen E, Nsereko M. Impact of an electronic medical record on quality of care in a primary care office. Del Med J 2001; 73: 187–194.

64. Adams WG, Mann AM, Bauchner H. Use of an electronic medical record improves the quality of urban pediatric primary care. Pediatrics 2003; 111: 626–632.

65. Sullivan FM, McEwan N, Murphy G. Regional repositories, reintermediation and the new GMS contract: cardiovascular disease in Tayside. Inform Prim Care 2003; 11: 215–221.

66. National Health Service. NHS Number. Available at: http://www.nhsia.nhs.uk/nnp/pages/default.asp.

67. Open Source Initiative (OSI). Available at: http://www.opensource.org/.

68. Health Level Seven (HL7). Available at: http://www.hl7.org/.

69. Emery J, Hayflick S. The challenge of integrating genetic medicine into primary care. BMJ 2001; 322: 1027–1030.

70. Emery J, Lucassen A, Murphy M. Common hereditary cancers and implications for primary care. Lancet 2001; 358: 56–63.

71. Lowrance WW. The promise of human genetic databases. BMJ 2001; 322: 1009–1010.

72. Sullivan FM, Pell JP, Sweetland M, Morris AD. How could primary care meet the informatics needs of UK Biobank? A Scottish proposal. Inform Prim Care 2003; 11: 129–135.

73. Hippisley-Cox J, Hammersley V, Pringle M, Coupland C, Crown N, Wright L. How useful are General Practice Databases for research? Analysis of the accuracy and completeness in one research network. Health Inform J 2004; 10: 91–109.

74. Fleming DM. The denominator for audit in general practice. Fam Pract 1985; 2: 76–81.

75. Millett C, Bardsley M, Binysh K. Exploring the effects of population mobility on cervical screening coverage. Public Health 2002; 116: 353–360.

76. Robson J, Falshaw M. Audit of preventive activities in 16 inner London practices using a validated measure of patient population, the ‘active patient’ denominator. Healthy Eastenders Project. Br J Gen Pract 1995; 45: 463–466.

77. Juncosa S, Bolibar B. Measuring morbidity in primary care [Article in Spanish]. Aten Primaria 2001; 28: 602–607.

78. Falkoe E, Rasmussen KB, Maclure M, Schroll H. Statistical linkage of treatment to diagnosis for research and monitoring of practice patterns. Methods Inf Med 2004; 43: 282–286.

79. Bayegan E, Nytro O, Grimsmo A. Ranking of information in the computerized problem-oriented patient record. Medinfo 2001; 10: 594–598.

80. Le Pont F, Letrilliart L, Massari V, Dorleans Y, Thomas G, Flahault A. Suicide and attempted suicide in France: results of a general practice sentinel network, 1999–2001. Br J Gen Pract 2004; 54: 282–284.

81. de Grauw W, van de Lisdonk EH, van den Hoogen HJM, van Weel C. Cardiovascular morbidity and mortality of type 2 diabetes patients. Diabet Med 1995; 12: 117–122.

82. Jordan K, Porcheret M, Croft P. Quality of morbidity coding in general practice computerized medical records: a systematic review. Fam Pract 2004; 21: 396–412.

83. Tierney WM, McDonald CJ. Practice databases and their uses in clinical research. Stat Med 1991; 10: 541–557.

84. de Lusignan S, Valentin T, Chan T et al. Problems with primary care data quality: osteoporosis as an exemplar. Inform Prim Care 2004; 12: 147–156.

85. Letrilliart L, Guiguet M, Flahault A. Reliability of report coding of hospital referrals in primary care versus practice-based coding. Eur J Epidemiol 2000; 16: 653–659.

86. Cricelli C, Mazzaglia G, Samani F et al. Prevalence estimates for chronic diseases in Italy: exploring the differences between self-report and primary care databases. J Public Health Med 2003; 25: 254–257.

87. Soler JK. Sick leave certification: a unique perspective on frequency and duration of episodes – a complete record of sickness certification in a defined population of employees in Malta. BMC Fam Pract 2003; 4: 2.

88. Garside P. Are we suffering from change fatigue? Qual Saf Health Care 2004; 13: 89–90.

89. Axelsson R. The organizational pendulum – healthcare management in Sweden 1865–1998. Scand J Public Health 2000; 28: 47–53.

90. College of General Practitioners. A classification of disease. J Coll Gen Pract 1959; 2: 140–159.

91

de Grauw WJ, van Gerwen WH, van de Lisdonk EH, van den Hoogen HJ, van den Bosch WJ, van Weel C. Outcomes of audit-enhanced monitoring of patients with type 2 diabetes.

J Fam Pract
2002
;
51
:
459
–464.

92

Carr-Hill RA, Rice N, Roland M. Socioeconomic determinants of rates of consultation in general practice based on fourth national morbidity survey of general practices. BMJ 1996; 312: 1008–1012.

93

Lowrance W. Learning from experience: privacy and the secondary use of data in health research. J Health Serv Res Policy 2003; 8 (Suppl 1): 2–7.

94

Bolton Research Group. Patients' knowledge and expectations of confidentiality in primary health care: a quantitative study. Br J Gen Pract 2000; 50: 901–902.

95

Fletcher J, Marriott J, Phillips D. Data protection, informed consent, and research: interpretation of legislation should reflect patients' views. BMJ 2004; 328: 1437.

96

Classification Committee of WONCA. An international classification of health problems in primary care (ICHPPC). London: The Royal College of General Practitioners, Occasional paper 1; December 1976.

97

World Health Organisation. Manual of the International Statistical Classification of Diseases, Injuries, and Causes of Death (8th Revision). Geneva: WHO; 1967.

98

Lamberts H, Meads S, Wood M. Classification of reasons why persons seek primary care: pilot study of a new system. Public Health Rep 1984; 99: 597–605.

99

Classification Committee of WONCA. International Classification of Process in Primary Care (IC-process-PC). Oxford: Oxford University Press; 1986.

100

WONCA (World Organisation of National Colleges and Academies of Family Physicians). International Classification for Primary Care (ICPC-2). Available at: http://www.ulb.ac.be/esp/wicc/icpc2.html.

101

Britt H, Miller G. ICPC-Plus. An extended version of the International Classification for Primary Care for computerised clinician systems. London: British Computer Society, Proceedings of the annual conference of the Primary Healthcare Specialist Group; 1996: 107–112.

102

Robinson D, Wanger K, Price C. The clinical terms and ICPC: identifying equivalence and enabling compatibility for ‘Non-medical’ terms. British Computer Society, Primary Care Specialist Group Annual Conference 1998. Available at: http://phcsg.ncl.ac.uk/conferences/cambridge1998/robinson.htm.

103

O'Neil M, Payne C, Read JD. Read codes version 3: a user led terminology. Methods Inf Med 1995; 34: 187–192.

104

Stearns MQ, Price C, Spackman KA, Wang AY. SNOMED clinical terms: overview of the development process and project status. In Bakken S (ed.) Proceedings of the 2001 AMIA Fall Symposium. Philadelphia: Hanley and Belfus; 2001: 662–666.

105

NHS Information Authority. Global Standard for Healthcare Terminology. Birmingham: NHSIA; 1999. Available at: http://www.nhsia.nhs.uk/snomed/pages/publications.asp.

Author notes

(a) Primary Care Informatics, Division of Community Health Sciences, St George's Hospital Medical School, London SW17 0RE, UK and (b) Chair, Department of Family Medicine, Department of General Practice, University Medical Centre Nijmegen, 229-HAG, PO Box 9101, 6500 HB Nijmegen, The Netherlands

Comments

Response to Dr Soler et al.,
31 July 2006
Simon de Lusignan (with Chris van Weel)
Senior Lecturer, St George's, University of London

Dear Editor, Thank you for the letter from Dr Soler et al.; we will clarify the points they raise. Although they appear to criticise our review of the International Classification of Primary Care (ICPC), we did recognise their work and referenced their publications. Firstly, our description of the ICPC fits exactly with that of the authors of the letter; the only difference is that we use the phrase “diagnostic and therapeutic procedure” where they use the term “interventions”.

We described ICPC as allowing the recording of: “…reason for encounter, diagnostic and therapeutic procedures and diagnoses.” The authors of the letter describe it as recording: “…reasons for encounter, the diagnoses, and the interventions.”

We think it useful that the letter highlights, in its second paragraph, projects that have used ICPC and, through much of the rest of the letter, how ICPC is being developed. In our paper we had to draw a line at the level of detail that we could devote to the many databases we reviewed, and it is a bit unfair of the letter writers to interpret this as undervaluing ICPC. One of us (SdL) has co-organised with colleagues from the Netherlands a primary care study the day ahead of the EFMI (European Federation for Medical Informatics) conference in Maastricht in August, to which the authors of this letter would be most welcome and from which they may judge the importance we attribute to the development of ICPC.

The scope of our paper was the opportunities and challenges for researchers using computer data currently available for research. Whilst we welcome the development of newer versions of ICPC, the letter’s authors fully admit that it will take many years for these to become fully operationalised. We feel that these developments were not sufficiently concrete to merit inclusion; such work in progress is difficult to include in a review, though we accept that the authors may feel otherwise. Whilst we respect their advocacy for ICPC and look forward to their work on ICPC coming to fruition and expanding the informatics evidence-base, we feel we have given a balanced account of the state of the art. Yours sincerely, Simon de Lusignan, Chris van Weel

Conflict of Interest:

None declared

Submitted on 31/07/2006 4:29 AM GMT

Outlining errors and inaccuracies in this review
18 July 2006
Jean K Soler (with Inge M. Okkes)
Family Doctor, The Family Practice, Attard BZN04, Malta

De Lusignan and Van Weel’s contribution to the Medical Informatics Review Series in the March 2006 issue of FP, an overview of an extremely important topic, was unfortunately rather disappointing.1 The article is incomplete and unbalanced in areas, and contains some factual errors, especially (Boxes 1 and 3; Table 3)1 with regard to describing and assessing the International Classification of Primary Care (ICPC).2-4

It is important for those of us who are involved with the maintenance and development of ICPC that we set the record straight by listing these inaccuracies and errors. We list some of the most important ones below.

1. ICPC was not developed as, and is not, a diagnostic classification for encounter data.5-8 ICPC is meant to document and code episodes of care over time, characterizing the changing relations between the three elements coded for each and every encounter within episodes of care: i.e. the reasons for encounter, the diagnoses, and the interventions.

2. At least five substantial databases, routinely collected from family doctors’ practices, exist that showcase the comprehensive use of ICPC. These have been collected during the Transition Project from: The Netherlands (21 years), Japan (3 years), Poland (5 years), Serbia (now 2 years), and Malta (now 5 years).5, 7, 9, 10 Only the Dutch one is mentioned in Table 3, although with an incorrect reference (the complete database can be downloaded from http://www.transitieproject.nl). The five other projects involving the use of ICPC that are mentioned in Table 3 of the review1 are based on a use of ICPC for which this classification was not originally developed. The authors should have highlighted this fact, should have discussed the effects of that circumstance on the quality of the data, and could have proffered their thoughts on the utility of (in that sense) incomplete data. In fact, they could have discussed whether in such cases, where ICPC is not used comprehensively, ICD-10 would not have been the preferred classification.

3. The presented bibliographic data on ICPC are riddled with inaccuracies.1 The authors do not properly refer to ICPC-1, ICPC-2, or any of its electronic updates and, to top it off, they do not even mention the publication of its most recent update, ICPC-2-R, in 2005.2-4 The latter4 was presented at the Wonca Asian Pacific meeting in Kyoto, Japan, in May 2005, and it seems incomprehensible that the President-Elect of Wonca, one of the authors, could have missed this. This revision is accompanied by a CD-Rom containing a host of information on the use of ICPC, and data collected using ICPC.4, 5 This one source should have prevented De Lusignan and Van Weel from making their sweeping statement that ‘the current standards of data collection in family practice are insufficient’.

4. On the CD-Rom mentioned above,5 an ICPC2-ICD10 Thesaurus, with more than 75,000 diagnostic terms and intended for use in automatic double-coding within electronic patient records, has been made available. This Thesaurus is included in the Metathesaurus of the National Library of Medicine in the United States of America, and is (since version 2005AA) mapped to the diagnostic concepts in Snomed-CT. Earlier, the same ICPC2/ICD10 structure was mapped to Clinical Terms version 3 (CTV-3). De Lusignan and Van Weel’s discussion of the need for granularity in clinical records, and of the strategies to address recording free text, could have greatly benefited from addressing these realities.

5. The authors’ remark on chapter Z is erroneous,1 as it reflects not chapter Z of ICPC but chapter Z of ICD10, which will probably eventually be considered for a more detailed mapping to the process codes of ICPC-3.

It is our duty to add that the authors give incorrect information about the status of the revision of ICPC-2 towards ICPC-3/ICD-11 within the context of the World Health Organisation’s Family of International Classifications (WHO-FIC). This process is in its infancy, and it is expected to take at least five to ten years before both will be available in an operational form. It is to be expected that the cooperation between Wonca and WHO with regard to the mutual revisions of both classifications will also take considerable time. It seems evident, then, that users of ICPC-2-R, whether they use it comprehensively or not, can be quite confident that this classification will be the state of the art for the next ten years.

References

1. De Lusignan S, Van Weel C. The use of routinely collected computer data for research in primary care: opportunities and challenges. Fam Pract 2006;23:253-63.
2. Lamberts H, Wood M, eds. ICPC. International Classification of Primary Care. Oxford: Oxford University Press, 1987.
3. ICPC-2. International Classification of Primary Care, Second edition. Oxford: Oxford University Press, 1998.
4. ICPC-2-R. International Classification of Primary Care. Revised Second Edition. Oxford: Oxford University Press, 2005.
5. Okkes IM, Oskam SK, Lamberts H. ICPC in the Amsterdam Transition Project. (CD-Rom). Amsterdam: Academic Medical Center/University of Amsterdam, Department of Family Medicine, 2005.
6. Okkes IM, Becker HW, Bernstein RM, Lamberts H. The March 2002 update of the electronic version of ICPC-2. A step forward to the use of ICD-10 as a nomenclature and a terminology for ICPC-2. Fam Pract 2002;19:543-6.
7. Okkes IM, Polderman GO, Fryer GE, Yamada T, Bujak M, Oskam SK, Green LA, Lamberts H. The role of family practice in different health care systems. A comparison of reasons for encounter, diagnoses, and interventions in primary care populations in the Netherlands, Japan, Poland, and the United States. J Fam Pract 2002;51(1):72-3.
8. Okkes IM, Lamberts H. Classification and the domain of family practice. In: Jones R, ed. The Oxford Textbook of Primary Medical Care. Oxford: Oxford University Press, 2003. Vol 1: 139-52.
9. Soler JK, Okkes IM. Sick leave certification: an unwelcome administrative burden for the family doctor? The role of sickness certification in Maltese family practice. Eur J Gen Pract 2004;10:50-5.
10. Electronic source: http://www.phckraljevo.org (Accessed May 2006).

Dr. Jean Karl Soler, Attard, Malta
Dr. Inge Okkes, Amsterdam, The Netherlands

Conflict of Interest:

None declared

Submitted on 18/07/2006 1:50 AM GMT