Introduction The COVID-19 pandemic caused by SARS-CoV-2 places immense worldwide demand on healthcare services. Earlier identification of patients at risk of severe disease may allow intervention with experimental targeted treatments, mitigating the course of their disease and reducing critical care service demand.
Methods and analysis This prospective observational study of patients tested or treated for SARS-CoV-2, who are under the care of the tertiary University Hospital Southampton NHS Foundation Trust (UHSFT), captured data from admission to discharge; data collection commenced on 7 March 2020. Core demographic and clinical information, as well as results of disease-defining characteristics, was captured and recorded electronically from hospital clinical record systems at the point of testing. Manual data were collected and recorded by the clinical research team for assessments which are not part of the structured electronic healthcare record, for example, symptom onset date. Thereafter, participant records were continuously updated during hospital stay and their follow-up period. Participants aged >16 years were given the opportunity to provide consent for excess clinical sample storage with optional further biological sampling. These anonymised samples were linked to the clinical data in the Real-time Analytics for Clinical Trials platform and were stored within a biorepository at UHSFT.
Ethics and dissemination Ethical approval was obtained from the HRA Specific Review Board (REC 20/HRA/2986) for waiver of informed consent for the database-only cohort; the procedures conform with the Declaration of Helsinki. The study design, protocol and patient-facing documentation for the biobanking arm of the study have been approved by North West Research Ethics Committee (REC 17/NW/0632) as an amendment to the National Institute for Health Research Southampton Clinical Research Facility-managed Southampton Research Biorepository. This study will be published as peer-reviewed articles and presented at conferences, presentations and workshops.
- health informatics
- protocols & guidelines
- chronic airways disease
- respiratory infections
- respiratory medicine (see thoracic medicine)
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
Close alignment of research and clinical practice in a near real-time manner.
Longitudinal data collection and sampling opportunities.
A single-centre study with data collection reflective of clinical need rather than a strict protocolised time frame.
Use of novel artificial intelligence techniques for data analysis.
The outbreak of COVID-19 caused by the novel SARS-CoV-2 was declared a pandemic by the WHO on 11 March 2020.1 In the UK, 1 629 657 people have tested positive for SARS-CoV-2, of which 66 713 have died (https://www.gov.uk/guidance/coronavirus-covid-19-information-for-the-public; correct as of 17:00, 30 November 2020).2 So far, emerging effective treatments such as remdesivir and dexamethasone have only modest effects on clinical outcomes in a subcohort of patients,3 4 and any vaccines have yet to demonstrate their efficacy. While many patients recover from SARS-CoV-2 infection without need for hospitalisation, a small proportion go on to develop severe disease. The demand on healthcare services, especially critical care, is immense. Earlier identification of those patients at risk of severe disease may allow intervention with experimental targeted treatments mitigating the course of their disease and reducing demand on critical care services.
The need for rapid translation of treatments from development to clinical application is being addressed by urgent public health studies such as ACCORD-25 and RECOVERY trials.6 Despite the urgency to enrol patients in these studies, there are challenges in identifying suitable patients in a fluid, ‘real-time’ clinical environment within a novel disease. The longitudinal Research Evaluation Alongside Clinical Treatment in COVID-19 (REACT COVID-19) database and biosampling study captures the natural history of SARS-CoV-2 infection and its clinical response to current interventions, as well as characterises specific phenotypes and endotypes, with the potential to rapidly identify patients for novel treatment trials.
The Research Evaluation Alongside Clinical Treatment observational and biobanking study of COVID-19 (REACT COVID-19 study) evolved around a collaboration between the University of Southampton (UoS), University Hospital Southampton NHS Foundation Trust (UHSFT) and the digital Experimental Cancer Medicine Team (ECMT) and their work to create the REACT platform. The REACT platform is used for the determination of early clinical benefit of new cancer medicines in early-phase drug trials and has been adapted by the digital ECMT in collaboration with UHSFT to allow rapid upload and interpretation of clinical data in the context of the COVID-19 pandemic. This study is supported by the National Institute for Health Research (NIHR) Southampton Clinical Research Facility (CRF) and NIHR Southampton Biomedical Research Centre (BRC), which takes advantage of the expertise and resource in hosting large prospective cohort studies.
Development of a real-time ‘parent’ database of well-characterised patients with COVID-19 to facilitate a better understanding of the natural history of SARS-CoV-2 infection from prehospital through to level 3 care and subsequent course of COVID-19-related complications from discharge through to clinic follow-up.
Cross-sectional phenotypical characterisation of a cohort of patients with COVID-19 to describe the disease heterogeneity seen in clinical practice.
Identify potential risk factors for progression to severe disease measured by admission to level 2 or 3 care, increase in respiratory support or death (primary endpoint).
Use of samples stored in the biorepository (eg, blood, urine and sputum) to develop an endotype-level understanding of disease clusters.
Assess the efficacy of current best practice management strategies.
Identify subgroups of patients for novel treatment strategies in the form of independent formal randomised control trials.
Methods and analysis
The REACT COVID-19 study is a prospective observational study of patients tested or treated for SARS-CoV-2 infection who are under the care of UHSFT. The study is planned as a long-term research project with no limit to enrolment nor any time-defined end, with data collection commencing on 7 March 2020 with admission of the first SARS-CoV-2 patient to University Hospital Southampton (UHS). Patients are recruited into the study at the point of testing for SARS-CoV-2 at UHSFT. Core demographic and clinical information, as well as results of disease-defining characteristics, are captured and recorded electronically from hospital clinical record systems at the point of testing. Manual data are collected and recorded by the clinical research team for assessments which are not part of the electronic healthcare record, for example, symptom onset date. Thereafter, participants’ records are continuously updated during their hospital stay and for 12 months following discharge from acute admission, in their follow-up period under UHSFT. This pragmatic, opportunistic approach to data collection takes full advantage of the electronic healthcare records and seamlessly links with the REACT platform to reduce data collection burden on the patient, clinician or researcher. In addition, participants over the age of 16 years old are given the opportunity to provide consent for storage of excess sample taken as part of their routine clinical management with optional further biological sampling. These anonymised samples are linked to the anonymised clinical data in the REACT platform and stored within the Southampton Research Biorepository (SRB), managed by the NIHR CRF at UHSFT.
Patient and public involvement
Patient representatives were part of the governance structure of the SRB and were involved in the design and ongoing management of the WATCH study, on which much of this observational biobank study was based. The SRB Oversight Committee includes a lay representative who works alongside other members and may be consulted jointly on matters related to the construction and running of the biorepository.
UHSFT is an 1100-bed tertiary centre and, as such, has assessed and admitted a large number of SARS-CoV-2-positive patients.
All patients under the care of UHSFT who are tested or treated for SARS-CoV-2 are included in the REACT COVID-19 study. A subcohort of patients aged ≥16 years old are given the opportunity to provide consent for storage of excess sample taken as part of their routine clinical management with optional further biological sampling. They are offered the opportunity to provide this at any point during their care under UHSFT. So far, UHSFT has admitted 629 patients with confirmed COVID-19, correct as of 17 June 2020. Study enrolment will be continual into the future, aligned to the ongoing function of the clinical service with no discrete recruitment target or endpoint.
This study is divided into two parts and is conducted in conjunction with standard clinical care. The database component involves the assimilation of data collected as part of routine clinical care for patients of any age tested for or admitted to UHSFT for suspected SARS-CoV-2 infection. Core demographic and clinical data collected as part of standard clinical care will be extracted from the electronic hospital records. Where automatic electronic export is not possible, additional data relevant to the COVID-19 clinical course are manually extracted from the clinical notes and uploaded to the database by clinical researchers. Data collected from study participants are collated in the highly secure, contemporary and encrypted data platform, BC|INSIGHT (Microsoft Data Centre, UK). Data are then uploaded to the REACT platform to allow rapid capture of the natural history of the disease, with clinicians able to visualise trends in patient trajectories and to identify patients who may be suitable for approach for intervention studies (summarised in figure 1).
Data captured at point of testing for SARS-CoV-2 includes core demographic and clinical data relevant to COVID-19. Subsequent data capture follows clinical disease course and includes information on medication, vital signs, and pathological and radiological measurements. These data are captured in real time during the follow-up period under UHSFT care, including any period of hospitalisation, and for 12 months after admission to capture virtual follow-up or outpatient clinic follow-up data (summarised in figure 2). Any treatment interventions as part of routine clinical care are also captured (where possible electronically), and these form part of the visualisations in the REACT platform. Data for research purposes will be stored in an anonymised format.
The majority of data collected is consistent with the data already being collected as part of routine clinical care of patients with COVID-19, therefore seamlessly linking with clinical practice. Clinical data to be captured are listed in table 1. The REACT database will host a clinically identifiable ‘front end’ for real-time patient management during the acute admission and for up to 12 months postdischarge, in which the linked anonymised data will be made identifiable and visible only to the clinical team as determined by a statement of reference. Data for research purposes will be stored in an anonymised format.
A subcohort of patients aged ≥16 years old are given the opportunity to provide consent for storage of excess sample taken as part of their routine clinical management with optional further biological sampling (ie, blood, urine, induced sputum, nasal samples, exhaled breath or bronchial wash samples; see table 2). They are offered the opportunity to provide this at any point during their care under UHSFT. Care is taken to offer this study to participants after or alongside recruitment to the national NIHR Urgent Public Health Priority observational studies (International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC), Diagnosis and Management of Febrile Illness using RNA Personalised Molecular Signature Diagnosis (DIAMONDS) and NIHR Bioresource). These samples are stored within the SRB (managed under the UoS Human Tissue Act Licence) and analysed accordingly to enable biomarker measures related to COVID-19, including genomics, transcriptomics, proteomics, lipidomics and biochemical studies. These measures and the clinical data are reciprocally iterative and inform knowledge as they feed into the clinical care in determining treatment pathways and the response to treatment, as well as the natural history of SARS-CoV-2 infection. Samples are stored as coded samples, without participant names, that can be linked to clinical data (linked anonymised).
Data are captured longitudinally, with change over time recorded explicitly. After admission, data capture will occur longitudinally in parallel with the participant's ongoing clinical follow-up for up to 12 months after discharge following their acute COVID-19 admission. Active participation in the database would end 12 months after the participant was discharged from UHSFT following their acute admission with COVID-19 or if there was no further involvement of UHSFT clinical teams in their care before this time. The database will continue to acquire data until no further patients are seen at UHSFT with COVID-19 and all patients have been discharged from UHSFT care. Storage of data for research purposes will be in an anonymised format. Clinical and research data collected from the cohort in the study are kept in a highly secure contemporary encrypted data platform BC|INSIGHT (within the Clinical Informatics Research Unit, UoS) that was set up in a Microsoft Data Centre in South UK.
The majority of data capture is electronic, and therefore we expect missing data to be minimal. For all data other than symptom onset, any missing data will be characterised as ‘missing’ and will be excluded from analysis. In keeping with the data capture running in parallel with clinical care rather than via independent research ‘visits’, the data collection may not include certain variables for all patients if the capture of that information or a sample subset was not clinically indicated.
For symptom onset date, which is used for the purposes of analysis as day 0, a valid date is necessary. Data will be extracted manually from the medical records and will be classified as the following:
Unclear: in this scenario, admission date will be used.
Nosocomial infection (acquired while in the hospital): date of first positive swab will be used if symptom onset date is not otherwise available.
These three groups will be identified within the dataset and analysed for significant differences prior to assessment as a whole cohort for a specific outcome.
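The day 0 rules above can be sketched in Python as follows. This is a minimal illustration, not the study's own implementation: the function names, the string labels and the `documented` label for cases with a recorded symptom onset date are all assumptions introduced here, since the text names only the unclear and nosocomial groups explicitly.

```python
from datetime import date
from typing import Optional


def day_zero(
    onset_class: str,
    symptom_onset: Optional[date],
    admission: date,
    first_positive_swab: Optional[date],
) -> date:
    """Return the date treated as day 0 for one participant."""
    if onset_class == "unclear":
        # Unclear onset: the admission date is used.
        return admission
    if onset_class == "nosocomial":
        # Nosocomial: first positive swab, unless an onset date is available.
        if symptom_onset is not None:
            return symptom_onset
        if first_positive_swab is None:
            raise ValueError("nosocomial case needs an onset or swab date")
        return first_positive_swab
    if onset_class == "documented":  # hypothetical label, not from the protocol
        if symptom_onset is None:
            raise ValueError("documented case needs a symptom onset date")
        return symptom_onset
    raise ValueError(f"unknown onset class: {onset_class!r}")


def days_since_onset(event: date, zero: date) -> int:
    """Express an event date as whole days relative to day 0."""
    return (event - zero).days
```

Keeping the classification as an explicit argument, rather than inferring it from which dates happen to be present, preserves the group label so that the groups can be compared before pooling.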
Analysis of the whole dataset will be both cross-sectional and longitudinal according to specific study questions. Specific datasets can be extracted from the database to investigate specific hypotheses.
This cohort can also be interrogated for inclusion criteria towards additional trials occurring in parallel to the REACT COVID-19 study. Participation in clinical trials will not exclude patients from the cohort but may omit patients from some analysis depending on the research question.
A variety of statistical methods will be employed to support a range of assessment needs. Associations to potential risk factors will also be tested via multiple logistic regression analyses. Statistical significance levels will be set at a p value of <0.05.
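As an illustration of multiple logistic regression, the sketch below fits a model with two binary risk factors by plain gradient ascent on the log-likelihood. It is a teaching sketch under stated assumptions: the toy data, the variable meanings and the learning-rate settings are invented for the example, and the significance testing against the p<0.05 threshold is deliberately left out because it would normally be done with an established statistics package.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, outcomes, lr=0.5, n_iter=5000):
    """Fit intercept + coefficients by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * (len(rows[0]) + 1)  # beta[0] is the intercept
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, outcomes):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = y - p
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, x) -> float:
    """Predicted probability of the outcome for one set of risk factors."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


# Hypothetical toy data: columns are two binary risk factors,
# outcome is 1 if the patient progressed to severe disease.
rows = [[0, 0]] * 4 + [[0, 1]] * 2 + [[1, 0]] * 3 + [[1, 1]] * 3
outcomes = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0]

beta = fit_logistic(rows, outcomes)
odds_ratios = [math.exp(b) for b in beta[1:]]  # adjusted odds ratio per factor
```

Each exponentiated coefficient is the odds ratio for that factor adjusted for the other, which is the quantity usually reported for a risk factor association.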
Cluster analysis methodology will also be used to determine the heterogeneous nature of COVID-19 through an unbiased approach. Tests of correlation will be used to ensure that excessive collinearity of cluster variables does not bias the clustering process. Associations of disease clusters/phenotypes with cluster variables across resulting clusters will use analysis of variance for continuous variables and χ2 tests for binary variables. Subsequent tests of association to identify patterns of disease morbidity for disease clusters will be conducted. Further tests of association for disease clusters with potential epidemiological and pathophysiological risk factors will also be undertaken.
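The collinearity check that precedes clustering could look like the sketch below, which computes pairwise Pearson correlations and flags highly correlated variable pairs. The 0.8 cut-off, the function names and the example variables are assumptions made for illustration; the protocol does not state a threshold or a specific correlation test.

```python
import math
from itertools import combinations


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical candidate cluster variables (values are invented).
candidates = {
    "marker_a": [1, 2, 3, 4, 5],
    "marker_b": [2, 4, 6, 8, 10],  # exact multiple of marker_a
    "marker_c": [5, 1, 4, 2, 3],
}
flagged = collinear_pairs(candidates)
```

For a flagged pair, one variable would typically be dropped or the two combined before clustering, so that the shared signal is not counted twice.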
To facilitate artificial intelligence and machine intelligence learning from this dataset, the following will be undertaken:
Mapping summary statistical data from existing intensive care unit studies (Observational Health Data Sciences and Informatics and similar), as well as demographics data, using these to contextualise and enrich the patterns observed for participants with COVID-19 at UHSFT.
The contextualisation component would use existing studies as a reference frame to contrast and compare similarities and differences among UHSFT and external patient cohorts.
The enrichment component would use existing studies and demographics as priors to support Bayesian inference over UHSFT data.
Codesign and evaluate a Bayesian inference model.
With a possible extension of the Bayesian framework into a causal inference setting.
Provide a critical analysis of the ethical and safety aspects related to the clinical application of Bayesian inference, that is, supporting the clinical decision-making process.
Other innovative artificial intelligence and machine learning methods will also be explored to identify and predict those patients who have worse outcomes.
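To make the enrichment idea concrete, the sketch below shows the simplest form of Bayesian inference with an external prior: a Beta-Binomial update for an event proportion. It is an assumed toy model, not the model the study will codesign; the counts, the `weight` parameter used to down-weight the external study, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta distribution over an event probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def prior_from_external(events: int, total: int, weight: float = 1.0) -> Beta:
    """Build a prior from external summary counts.

    Starts from a uniform Beta(1, 1) and adds the external counts,
    scaled by `weight` (an assumption) so that an external cohort
    does not dominate the local data.
    """
    return Beta(1 + weight * events, 1 + weight * (total - events))


def update(prior: Beta, events: int, total: int) -> Beta:
    """Conjugate update with locally observed counts."""
    return Beta(prior.alpha + events, prior.beta + (total - events))


# Hypothetical numbers: an external study reports 30 events in 100 patients,
# down-weighted to a tenth; locally 4 events are observed in 10 patients.
prior = prior_from_external(30, 100, weight=0.1)
posterior = update(prior, 4, 10)
```

With these invented counts the posterior mean sits between the external rate and the local rate, which is the behaviour the contextualisation and enrichment components describe.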
Ethics and dissemination
The study design, protocol and patient-facing documentation have been approved by North West Research Ethics Committee (REC 17/NW/0632) as an amendment to the NIHR Southampton CRF-managed SRB. No patient identifiable data are included in this paper. Ethics approval for the study was obtained from HRA Specific Review Board (REC 20/HRA/2986) for waiver of informed consent for the database-only cohort. The database will be conducted in accordance with the principles of good clinical practice. Database documents (paper and electronic) will be collected and retained in accordance with the General Data Protection Regulations 2018 in a secure location during and after the trial has finished. All essential documents including source documents will be retained for a minimum period of 5 years following the end of the study. Clinical and research data collected in the database will be kept in a highly secure contemporary encrypted data platform BC|INSIGHT that was set up in a Microsoft Data Centre in South UK. Access to the study’s data is only given to study team members that have signed the delegation logs. Data access is via an encrypted web service with username/password-type authentication. The data held in the server are backed up weekly using the snapshot technology provided by Microsoft Azure, and the retention period of the snapshot storage is 2 years. Storage of data for research purposes will be in an anonymised format. Anonymised data may be released to individuals or organisations outside the UK, following approval by the REACT COVID-19 data access committee. A REACT COVID-19 Data Access Management Committee will be established to prioritise and ensure appropriate governance of requests to access linked anonymised clinical data. 
REACT Committee membership will include a chief investigator and principal investigators, the scientific lead for REACT and a representative from UHSFT Research and Development and/or Academic Health Sciences Centre prioritisation committee.
Written informed consent will be obtained from all study participants who agree to biobanking of excess clinical sample with or without additional sampling, as per approvals for amendment to the SRB reviewed by North West Research Ethics Committee (REC 17/NW/0632). For patients who have impaired cognition, informed consent will be sought from their legally acceptable representative, with retrospective consent sought for those who regain capacity to consent following recovery from acute illness. The findings from this study will be disseminated locally and internationally through manuscript publications in peer-reviewed journals.
SARS-CoV-2 infection is presenting a heterogeneous pattern of disease, with varying presentations from asymptomatic carriage, as demonstrated among healthcare workers,7 to severe disease with mortality rates of 50% in patients requiring intensive care.8 Emerging understanding is identifying some comorbidities and risk factors for progression to severe disease, but there is still a wide breadth of severity among patients displaying the same risk profile.
The longitudinal, mainly automated data collection of the REACT COVID-19 study enables a granularity of data that has potential to identify early markers of disease severity and the relevance of their trajectory of change on disease progression. The REACT COVID-19 database can be used to rapidly and easily identify participants who may be appropriate for clinical trials and offers visual interpretation of large amounts of data for hypothesis generation.
The database provides a rich dataset to support the biobanking subcohort, within which there is the potential to recruit to smaller, focused basic science studies investigating the mechanisms driving disease processes resulting from infection with SARS-CoV-2.
The REACT COVID-19 study is mindful of the national and international efforts to bring together data from multiple centres in order to enhance the power of studies, be they clinical, observational, mechanistic or intervention studies. The study will therefore be aligning itself with the broader UHS and UoS research project of Enabling new treatment approaches for COVID-19 Treatment, supported by the Southampton Coronavirus Support Fund, NIHR Southampton BRC and NIHR Southampton CRF.
The ability of the REACT COVID-19 study to capture the granularity of the longitudinal path of clinical care is both a major strength and a potential limitation. There is no strict study ‘visit’ protocol to follow, with data capture occurring in a more fluid manner. Therefore, there is the potential for a greater number of incomplete datasets, with some patients having a greater amount of available data. This is likely to be reflective of the level of care required, such as in those patients who required intensive care support. Any analysis will allow for this, and populations may be subcohorted to answer specific questions if the data are only available for a specific group.
The simplicity of data visualisation within the REACT COVID-19 study belies the complexity of the concepts behind this project, facilitated through the unique collaboration between UHSFT, the UoS and the digital ECMT. This collaboration possesses the requisite expertise in their respective specialisms to deliver all aspects of this study, alongside the infrastructure provided by the NIHR Southampton CRF and NIHR Southampton BRC. The REACT COVID-19 study enables visualisation of detailed clinical data and the application of novel artificial intelligence methods, which will allow greater understanding of the natural history of this novel disease through both longitudinal and cross-sectional analyses.
The authors acknowledge the National Institute for Health Research (NIHR) Southampton Clinical Research Facility (CRF) and University Hospital Southampton NHS Foundation Trust Research Nursing teams for their support in the set-up of this study, Dr Dave Stockley (National Institute for Health Research CRF Southampton Research Biorepository Manager) and Dr Alastair Watson for their help in proofreading and preparation of the manuscript for submission.
Contributors HB and AF designed the protocol and drafted the manuscript; AD and MC were involved in protocol design; JB, HP and FB were involved in the realisation of data extraction, integration, transformation and upload processes; CK, GT and SNF were involved in protocol design for the biobanking subcohort; NS and SW are involved in manual data collection processes; DL, PF, HP and JB were involved in the design and adaptation of the Real-time Analytics for Clinical Trials platform; and TW was involved in study conception and protocol design. All authors reviewed the final manuscript.
Funding The REACT platform has been supported by the digital Experimental Cancer Medicine Team free of charge. The biobanking subcohort is supported by the NIHR Southampton CRF and NIHR Southampton Biomedical Research Centre at University Hospital Southampton NHS Foundation Trust and as part of a broader effort (Enabling New Treatment Approaches for COVID-19 Treatment) by the University of Southampton (UoS) charity (Office of Development and Alumni Relations). In addition, the Clinical Informatics Research Unit, UoS has supported infrastructure costs. The support described above was not provided from a specific award or grant.
Competing interests None declared.
Patient and public involvement Patients and/or the public were involved in the design, conduct, reporting or dissemination plans of this research. Refer to the Methods and analysis section for further details.
Patient consent for publication Not required.
Data availability statement No data are available.