Does an innovative paper-based health information system (PHISICC) improve data quality and use in primary healthcare? Protocol of a multicountry, cluster randomised controlled trial in sub-Saharan African rural settings

Introduction Front-line health workers in remote health facilities are the first point of contact with the formal health sector and are confronted with life-saving decisions. Health information systems (HIS) support the collection and use of health-related data. However, HIS focus on reporting and are ill-suited to supporting decisions. Since data tools are paper-based in most primary healthcare settings, we have produced an innovative Paper-based Health Information System in Comprehensive Care (PHISICC) using a human-centred design approach. We are carrying out a cluster randomised controlled trial in three African countries to assess the effects of PHISICC compared with the current systems.

Methods and analysis Study areas are in rural zones of Côte d’Ivoire, Mozambique and Nigeria. Seventy health facilities in each country have been randomly allocated to using PHISICC tools or to continuing to use the regular HIS tools. We have randomly selected households in the catchment areas of each health facility to collect outcome data (household surveys have been carried out in two of the three countries and the end-line data collection is planned for mid-2021). Primary outcomes include data quality and use, coverage of health services and health workers' satisfaction; secondary outcomes are additional data quality and use parameters, childhood mortality and additional measures of health workers' and clients' experience with the system. Just prior to the implementation of the trial, we had to relocate the study site in Mozambique due to unforeseen logistical issues. The effects of the intervention will be estimated using regression models, accounting for clustering using random effects.

Ethics and dissemination Ethics committees in Côte d’Ivoire, Mozambique and Nigeria approved the trial. We plan to disseminate our findings, data and research materials among researchers and policy-makers. We aim to have our findings included in systematic reviews on health systems interventions and in future guidance development on HIS.

Trial registration number PACTR201904664660639; Pre-results.


Introduction
(1) The authors reported that the project started in 2015 and that (a) a systematic review and a framework synthesis have been produced, and (b) studies characterising existing HIS in the three countries have been carried out. Nowhere in the manuscript have the authors provided this context prior to the current CRCT, which is essential.
(2) It will be useful if the authors can provide a figure encompassing the qualitative, quantitative, structure, process and outcome elements of the whole project.

Methods
(1) The authors first indicated no patient involvement in the research. However, in the data collection section, they included a patient satisfaction assessment. Clarification is required.
(2) The process of co-creation of the intervention among frontline health workers was only briefly described. What happened during and after the workshops, personal feedback and piloting under real-life conditions? How much time was spent? Who was involved? How did the researchers and health workers arrive at the final version?
(3) How different is the new intervention from the existing tool? It will be useful if the authors provide a summary of what has been added to the new tool to provide a strong rationale for the change.
(4) How did the researchers treat the heterogeneity of the three countries in terms of intervention design, health systems, the status of health workers in the health system and the scope of their service, data analysis and interpretation?

REVIEWER
McConnachie, Alex University of Glasgow, Robertson Centre for Biostatistics REVIEW RETURNED 02-May-2021

GENERAL COMMENTS
This review considers the paper by Bosch-Capblanch and colleagues, describing the design of a multinational cluster randomised trial of a paper-based health information system. This review focuses mainly on the statistical elements of the paper.
I thought the abstract was fine, but the methods section reads more like how the trial was intended to be carried out, rather than what has actually happened. The abstract could perhaps recognise that there have been some difficulties implementing the trial as originally planned.
I think the description of the study outcomes could be better. For example, the vaccination outcome reads as if it applies to the entire population of each health facility, whereas it is based on surveys of households at baseline and follow-up. This is a little clearer in Table 1, but could be clearer in the text.
In terms of the outcomes themselves, are the authors confident that they can be measured equally well, and in the same way, in the intervention and control HFs? The paper describes the data collection teams as being blind to the randomisation, which is good, but will they be able to stay blind when they start collecting some of the outcomes? Could the intervention actually improve some aspects of data collection (e.g. mortality data) and thereby make the outcomes for intervention HFs appear worse?
The sample size section was not very clear, but I recognise that it is a very difficult part of the paper to get right. Would it help if the R code used for simulations were to be made available in the supplementary materials? That way, at least someone could replicate what was done.
The authors state that they are aiming for a Type 1 error rate of 5%, but do not say whether this includes adjustment for having five primary outcomes. Crudely speaking, each outcome would have to be analysed at the 1% significance level.
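The reviewer's arithmetic can be made concrete with a minimal sketch. The five-outcome count comes from the protocol; the Holm variant and the example p-values are purely illustrative assumptions, not analyses from the trial:

```python
def bonferroni_alpha(alpha, n_tests):
    # Bonferroni: split the family-wise error rate evenly across tests,
    # e.g. 0.05 over five primary outcomes gives 0.01 per outcome.
    return alpha / n_tests

def holm_rejections(pvals, alpha=0.05):
    """Holm step-down procedure: slightly less conservative than
    Bonferroni while still controlling the family-wise error rate.
    Returns a reject/accept flag for each p-value in input order."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    reject = [False] * len(pvals)
    for rank, i in enumerate(order):
        # Compare the rank-th smallest p-value against alpha / (n - rank)
        if pvals[i] <= alpha / (len(pvals) - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

print(bonferroni_alpha(0.05, 5))                          # per-outcome threshold
print(holm_rejections([0.001, 0.04, 0.03, 0.2, 0.008]))   # illustrative p-values
```

Holm's procedure is shown only because it dominates plain Bonferroni at no cost; the authors' stated position is that they will not adjust at all.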
Also, the authors chose a value of 0.1 for k in their sample size calculations, with reference to Hayes and Bennett, but I could not find any recommendation in that paper to match this assumption. The best I could find was a general statement that values are often no more than 0.25, and rarely more than 0.5.

VERSION 1 -AUTHOR RESPONSE
Reviewer: 1 Introduction (1) The authors reported that the project started in 2015 and that (a) a systematic review and a framework synthesis have been produced, and (b) studies characterising existing HIS in the three countries have been carried out. Nowhere in the manuscript have the authors provided this context prior to the current CRCT, which is essential.
We understand that the reviewer asks for a better narrative relating these research components. We have rephrased.
(2) It will be useful if the authors can provide a figure encompassing the qualitative, quantitative, structure, process and outcome elements of the whole project.

Methods
(1) The authors first indicated no patient involvement in the research. However, in the data collection section, they included a patient satisfaction assessment. Clarification is required.
Clarified in the section. Patients were not involved in the research. We did approach community members, though, in the assessment of the outcomes.
(2) The process of co-creation of the intervention among frontline health workers was only briefly described. What happened during and after the workshops, personal feedback and piloting under real-life conditions? How much time was spent? Who was involved? How did the researchers and health workers arrive at the final version?
We are very glad to read this comment, because we had been very concise here due to space constraints. We have now given a fuller explanation in the subsection "Intervention".
(3) How different is the new intervention from the existing tool? It will be useful if the authors provide a summary of what has been added to the new tool to provide a strong rationale for the change.
Please see the response just above.
(4) How did the researchers treat the heterogeneity of the three countries in terms of intervention design, health systems, the status of health workers in the health system and the scope of their service, data analysis and interpretation?
An explanation has been added into the text.
Reviewer: 2 I thought the abstract was fine, but the methods section reads more like how the trial was intended to be carried out, rather than what has actually happened. The abstract could perhaps recognise that there have been some difficulties implementing the trial as originally planned.
We have tried to be more explicit by adding some statements and deleting some terms in order to respect the abstract word limit.
I think the description of the study outcomes could be better. For example, the vaccination outcome reads as if it applies to the entire population of each health facility, whereas it is based on surveys of households at baseline and follow-up. This is a little clearer in Table 1, but could be clearer in the text.
We have added detail, both in the narrative and in Table 1.
In terms of the outcomes themselves, are the authors confident that they can be measured equally well, and in the same way, in the intervention and control HFs? The paper describes the data collection teams as being blind to the randomisation, which is good, but will they be able to stay blind when they start collecting some of the outcomes? Could the intervention actually improve some aspects of data collection (e.g. mortality data) and thereby make the outcomes for intervention HFs appear worse?
This is really a good point, which we have discussed internally a lot. A clarification has been added after the list of secondary outcomes.
The sample size section was not very clear, but I recognise that it is a very difficult part of the paper to get right. Would it help if the R code used for simulations were to be made available in the supplementary materials? That way, at least someone could replicate what was done.
We have edited the sample size section for clarity. The simulation code is included as supplementary information.
The authors state that they are aiming for a Type 1 error rate of 5%, but do not say whether this includes adjustment for having five primary outcomes. Crudely speaking, each outcome would have to be analysed at the 1% significance level.
We limited the study to a small number of primary outcomes in which we were interested a priori, and do not plan to adjust the type 1 error rate.
Also, the authors chose a value of 0.1 for k in their sample size calculations, with reference to Hayes and Bennett, but I could not find any recommendation in that paper to match this assumption. The best I could find was a general statement that values are often no more than 0.25, and rarely more than 0.5.
Apologies, we should have written k=0.25 (we can reproduce the numbers with the code using k=0.25).
Given these two points, I do wonder whether the study could be underpowered. Is there any baseline data available that could inform the level of clustering of outcomes?
We have corrected the k value. We do not have data on the level of clustering, since there is little information on health systems from rural HFs in general. However, the areas are fairly homogeneous.
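To make the role of k concrete, the Hayes and Bennett formula for an unmatched cluster-randomised comparison of two proportions can be sketched as below. The coverage values (50% vs 65%) and the 30 households per cluster are illustrative assumptions, not the trial's actual parameters:

```python
import math
from statistics import NormalDist

def clusters_per_arm(pi0, pi1, m, k, alpha=0.05, power=0.8):
    """Hayes & Bennett formula: clusters needed per arm to detect a
    difference between true proportions pi0 and pi1, sampling m
    individuals per cluster, where k is the between-cluster
    coefficient of variation of the true proportions."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    within = (pi0 * (1 - pi0) + pi1 * (1 - pi1)) / m   # sampling variation
    between = k**2 * (pi0**2 + pi1**2)                 # between-cluster variation
    return math.ceil(1 + z**2 * (within + between) / (pi0 - pi1)**2)

# Illustrative (assumed) coverage of 50% vs 65%, 30 households per cluster:
print(clusters_per_arm(0.5, 0.65, 30, k=0.10))   # fewer clusters needed
print(clusters_per_arm(0.5, 0.65, 30, k=0.25))   # substantially more needed
```

Moving k from 0.1 to 0.25 roughly doubles the required number of clusters under these assumptions, which is the crux of the reviewer's concern about power.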

REVIEWER
Tseng, Yu-hwei University of the Witwatersrand, Centre for Health Policy, School of Public Health, Faculty of Health Sciences REVIEW RETURNED 02-Jun-2021

GENERAL COMMENTS
The authors have addressed most of the questions I raised in the first review by adding a flowchart of the whole project, describing frontline health workers' participation, and providing information about the new elements in the tool.
Two questions for the authors after their addition.
1. An important characteristic of the new tool is users' participation. The authors also emphasized the decision-making role of frontline health workers in the design of the new tool. Can they elaborate on how this is operationalized and measured?
2. Can the authors provide a systematic comparison of the old and new tools in order to highlight the value of their efforts?

REVIEWER
McConnachie, Alex University of Glasgow, Robertson Centre for Biostatistics REVIEW RETURNED 02-Jun-2021