Project20: maternity care mechanisms that improve access and engagement for women with social risk factors in the UK – a mixed-methods, realist evaluation

Objectives To evaluate how women access and engage with different models of maternity care, whether specialist models improve access and engagement for women with social risk factors, and if so, how.
Design Realist evaluation.
Setting Two UK maternity service providers.
Participants Women accessing maternity services in 2019 (n=1020).
Methods Prospective observational cohort with multinomial regression analysis to compare measures of access and engagement between models and place of antenatal care. Realist-informed, longitudinal interviews with women accessing specialist models of care were analysed to identify mechanisms.
Main outcome measures Measures of access and engagement, healthcare-seeking experiences.
Results The number of social risk factors women were experiencing increased with deprivation score, with the most deprived more likely to receive a specialist model that provided continuity of care. Women attending hospital-based antenatal care were more likely to access maternity care late (risk ratio (RR) 2.51, 95% CI 1.33 to 4.70), less likely to have the recommended number of antenatal appointments (RR 0.61, 95% CI 0.38 to 0.99) and more likely to have over 15 appointments (RR 4.90, 95% CI 2.50 to 9.61) compared with community-based care. Women accessing standard care (RR 0.02, 95% CI 0.00 to 0.11) and black women (RR 0.02, 95% CI 0.00 to 0.11) were less likely to have appointments with a known healthcare professional compared with the specialist model. Qualitative data revealed mechanisms for improved access and engagement including self-referral, relational continuity with a small team of midwives, flexibility and situating services within deprived community settings.
Conclusion Inequalities in access and engagement with maternity care appear to have been mitigated by the community-based specialist model that provided continuity of care. The findings enabled the refinement of a realist programme theory to inform those developing maternity services in line with current policy.

1. The sample used for quantitative analyses comprised 799 women accessing different models of care at 2 urban inner city hospitals. Sample size calculations appear to be based on equal group sizes of 250 women, yet the numbers accessing each model of care varied significantly. Were any calculations undertaken to assess study power for analyses involving different group sizes and/or account for clustering in the data (given that analyses include use of multinomial regression accounting for clusters)? And what implications do these factors have for inferences in relation to data presented?
2. Social characteristics of samples: Table 1 and Table 2 use different ways to present information regarding ethnicity and migration status. I may have missed it, but is there an explanation for the different approaches used? It would be useful to have the same information for both samples, and also for women of refugee/migrant background to know length of time since arrival in UK, and English language proficiency/requirement for interpreter for the women included in the quantitative data analysis. If these data are not available in routinely collected data, I would encourage the authors to comment on this in the discussion, as it is clearly an important consideration for service level planning with regard to meeting the needs of these communities.
3. Adjustment for social and medical risk factors in Model 1: If I understand correctly, adjustment was made for ethnicity, age, parity, deprivation score, number of social risk factors, and medical risk status. Given the relatively small numbers in some comparison groups, is there a risk of overadjustment in taking this approach?
4. Medical risk status: Apologies if I missed this, but how was medical risk defined at booking, and at labour commencement? And for analyses of model of care accessed by deprivation score and risk factors (Table 3) and gestation at booking in relation to model of care (Table 4), why adjust for medical risk status at labour commencement (assuming many factors influencing this are not apparent at the time of booking)?
5. Domestic abuse and mental health: How were these factors ascertained by services providing care to women in the quantitative sample? Arguably, the numbers reflect under ascertainment of these common experiences, including within specialist service settings? It would be useful for the authors to comment on this in drawing inferences from the findings.
6. Relational quality and missed appointments: These are complex analyses, and especially in Table 10, the small numbers in cells present challenges with regard to inference. In Table 10, leaving aside the small number of women who missed 4 or more appointments, the direction of effects consistently suggests that women in standard care were more likely to miss appointments compared with women in specialist models. While caution is needed due to the small numbers, a more Bayesian approach to inference may be warranted here.
7. Longitudinal interview study: I may have missed this information, but I could not find reference to aspects of the design that reflect a longitudinal approach, or how this was managed in analysis.
8. Programme theory: Given the cultural diversity of the interview sample, I was surprised that no reference was made to cultural safety or access to interpreters as mechanisms to improve access and engagement. It is unclear whether these issues were considered in scope, or outside the framework of the refined programme theory?
9. Another aspect of the programme theory that I found curious is the assumption in outcome 4 that continuity of care will lead to greater identification of social risk in women who have not previously disclosed issues of concern. One element of the quantitative findings that might suggest otherwise is the very low level of ascertainment of domestic violence (<5% prevalence) and mental health issues (6% overall) in the routinely collected data. It would be interesting to examine these data in light of the programme theory to explore whether there was higher ascertainment in specialist models, and at what stage of pregnancy issues were identified (ie is there evidence that relational continuity results in greater recognition/higher disclosure of social risk factors at later gestations of pregnancy?).
10. Public and patient involvement: How were service users recruited and supported to engage in planning and development of the research? What was the reason for limiting their role in interpretation of data to the qualitative findings?
11. Strengths and limitations: I would encourage the authors to provide a more detailed account of strengths and limitations. What do they see as the main strengths of the approach they have taken? With regard to limitations, other factors to consider include: limitations of variables available in routine data, lack of generalisability outside UK health care system, sample size considerations outlined above.
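To illustrate the concern raised in comment 1, the following is a minimal sketch of how power changes when the same total sample is split unequally between two groups, and how clustering shrinks the effective sample size. It uses a normal-approximation two-proportion test rather than the multinomial regression used in the study, and every number in it (the 10% and 20% outcome rates, the 450/50 split, the cluster size and intracluster correlation) is a hypothetical assumption chosen for illustration, not a value from the manuscript.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test for two independent proportions.

    Uses the usual normal approximation: pooled standard error under the
    null hypothesis, unpooled standard error under the alternative.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion under H0
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # The negligible chance of rejecting in the wrong tail is ignored.
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


def design_effect(mean_cluster_size, icc):
    """Kish design effect for clusters of roughly equal size."""
    return 1 + (mean_cluster_size - 1) * icc


# Hypothetical outcome rates of 10% vs 20% with 500 women in total.
equal_split = power_two_proportions(0.10, 0.20, 250, 250)
unequal_split = power_two_proportions(0.10, 0.20, 450, 50)

# Hypothetical clustering: 20 women per cluster, ICC of 0.05.
deff = design_effect(20, 0.05)
effective_n = 500 / deff

print(f"power, 250 vs 250: {equal_split:.2f}")
print(f"power, 450 vs 50:  {unequal_split:.2f}")
print(f"design effect: {deff:.2f}, effective sample size: {effective_n:.0f}")
```

Under these assumed inputs the unequal split yields noticeably lower power than the equal split for the same total sample, which is the pattern the comment asks the authors to address.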
Minor issues:
Table 1 (and several other tables) report p values of 0.000. Since probability cannot be zero, I suggest reporting these as p<0.001.
Page 5, line 48: Typo in the sentence "One model based is"; should this be "One model is based"?
Page 16, line 30 (and in other places): "Data" is plural, so the verb should be "were".
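The p value point above can be handled when tables are generated. The following is a minimal sketch, assuming results are formatted in Python; the helper name `format_p` and its default threshold are illustrative assumptions, not part of the authors' analysis code.

```python
def format_p(p, threshold=0.001, decimals=3):
    """Format a p value for a table, never printing it as 0.000.

    Values below `threshold` are reported as an inequality; all others
    are rounded to `decimals` decimal places.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < threshold:
        return f"p<{threshold}"
    return f"p={p:.{decimals}f}"


print(format_p(0.0000004))  # reported as an inequality
print(format_p(0.0321))     # reported to three decimal places
```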

GENERAL COMMENTS
Dear Editor, I want to thank the authors for such a thoughtful and well-put-together mixed methods realist study. I was particularly impressed by how much detail the authors put into obtaining the quantitative and qualitative pieces of information towards their realist/mechanism-based theory formulation.
Despite the above strengths of the paper, there is a major weakness. While the authors explicitly illustrated how information informing the theory formulation was obtained, the process by which the theory was formulated is not explained. From the information from the qualitative and quantitative sources, I have no idea how the authors obtained their CMO configurations illustrated in Table 13. I think that is a major flaw in the methodology of the paper. Formulating theories in realist research is known as retroductive theorizing, but there was no mention of this in this otherwise well-thought-out paper.
The authors should consider these three resources to help them in this aspect:
Thank you for inviting me to review this interesting and ambitious paper exploring the maternity care experiences of women with complex social risk factors. The authors are to be commended for the clarity of research questions, detailed explanation of quantitative findings, and the use of mixed methods to explore system level factors and mechanisms relating to access and engagement with care.
Notwithstanding these strengths, there are some aspects of the paper that warrant further consideration to contextualise inferences.
Thank you for your kind and encouraging feedback. Your detailed insights have been incredibly helpful in strengthening this manuscript. Please see revisions made below that we hope will contextualise the findings and inferences, and increase transparency of the limitations of this research.
1. The sample used for quantitative analyses comprised 799 women accessing different models of care at 2 urban inner city hospitals. Sample size calculations appear to be based on equal group sizes of 250 women, yet the numbers accessing each model of care varied significantly. Were any calculations undertaken to assess study power for analyses involving different group sizes and/or account for clustering in the data (given that analyses include use of multinomial regression accounting for clusters)? And what implications do these factors have for inferences in relation to data presented?
Thank you for this important consideration. As the power calculations were undertaken prior to data access and prospective analysis, there was an assumption (based on national targets for women to have access to continuity of care models) that the groups would be more evenly distributed. In reality, as you point out, the number of women allocated to different models of care varied significantly, with many outcomes being underpowered. This has been highlighted as a limitation of this research and prioritised as a future research recommendation. We have also revised the discussion section, highlighting that the quantitative analysis should be reviewed with caution due to the potentially underpowered groups, and not without the insights of the qualitative data to infer mechanisms, in line with realist underpinnings 1 (pg 28): The small and varied numbers in each quantitative data group should be taken into consideration due to the significant amount of multiple testing required to establish the separate effects of the model of care, place of care and service attended. This presents a potential limitation as the use of multiple testing can result in erroneous inferences, reducing the probability of detecting effects when they do exist 2.
2. Social characteristics of samples: Table 1 and Table 2 use different ways to present information regarding ethnicity and migration status. I may have missed it, but is there an explanation for the different approaches used? It would be useful to have the same information for both samples, and also for women of refugee/migrant background to know length of time since arrival in UK, and English language proficiency/requirement for interpreter for the women included in the quantitative data analysis. If these data are not available in routinely collected data, I would encourage the authors to comment on this in the discussion, as it is clearly an important consideration for service level planning with regard to meeting the needs of these communities.
Thank you. The reason for the differences in how these are presented is that we were restricted to the quantitative data that are recorded in the routinely collected hospital data across the two different services. For the qualitative analysis we were able to collect more detailed demographics. Language proficiency has been added to the
4. Medical risk status: Apologies if I missed this, but how was medical risk defined at booking, and at labour commencement? And for analyses of model of care accessed by deprivation score and risk factors (Table 3) and gestation at booking in relation to model of care (Table 4), why adjust for medical risk status at labour commencement (assuming many factors influencing this are not apparent at the time of booking)?
Revised: 'Medical risk status' added to Table 1. When referring to medical risk status we have now directed the reader to supplementary file 1 'definitions', where medical risk status is recorded as high or low by the healthcare professional at the initial maternity booking appointment, and then again at the onset of labour (pg 9).
The decision to adjust for medical risk factors at both booking and onset of labour is to account for those pregnancies which began as low risk but became high risk over the course of the pregnancy, which would change the appropriate number of appointments and level of engagement with services.

5. Domestic abuse and mental health: How were these factors ascertained by services providing care to women in the quantitative sample? Arguably, the numbers reflect under ascertainment of these common experiences, including within specialist service settings? It would be useful for the authors to comment on this in drawing inferences from the findings.
Revised definitions in supplementary file 1: 'these risk factors could be ascertained through self-reporting or previously recorded data'. Added to discussion (pg 26): 'Previous research has found a correlation between continuity models of care and increased referrals to support services 4, but it must be acknowledged that under ascertainment of sensitive issues such as mental health and domestic abuse remains likely whilst women perceive services as a form of surveillance and risk 5.' and (pg 28): 'The qualitative data revealed this continuity of care lessened anxiety, the need to repeat often complex social and medical histories to numerous professionals, and increased disclosure of social risk factors…'
6. Relational quality and missed appointments: These are complex analyses, and especially in Table 10, the small numbers in cells present challenges with regard to inference. In Table 10, leaving aside the small number of women who missed 4 or more appointments, the direction of effects consistently suggests that women in standard care were more likely to miss appointments compared with women in specialist models. While caution is needed due to the small numbers, a more Bayesian approach to inference may be warranted here.
Thank you for this point and suggestion of Bayesian analysis. This has prompted further reading and has been put forward as a future research recommendation within the discussion (pg 28): 'This could be overcome in future research using larger sample sizes, and Bayesian analysis to test the apparent mitigating effects of the specialist models of care on inequalities in access and engagement 1.' in line with similar advice such as 'The dominance of qualitative approaches in the field of realistic evaluation has been the starting point of several recent methodological papers promoting a more quantitative turn in realistic evaluation and in theory-based evaluation (e.g. Ford et al., 2018; Giffoni et al., 2018; Hawkins, 2016
7. Longitudinal interview study: I may have missed this information, but I could not find reference to aspects of the design that reflect a longitudinal approach, or how this was managed in analysis.
Revised. Added to methods (pg 8): 'Semi-structured, longitudinal interviews with 20 women with low socioeconomic status and social risk factors who were receiving specialist care from one of the two service providers were carried out at approximately 28 and 36 weeks' gestation, and 6 weeks post birth', and (pg 9): 'The qualitative data were coded using NVivo v.12 and analysed using a thematic framework analysis 6. This allowed for the organisation of a large qualitative dataset into a coding framework developed using previously constructed programme theories 7,8, to uncover new theories and differences in women's experiences depending on their characteristics 6.' and (pg 9): 'This method suited the longitudinal approach to data collection as changes in women's perceptions and relationships with healthcare providers could be seen over the course of their pregnancy and postnatal period.'
8. Programme theory: Given the cultural diversity of the interview sample, I was surprised that no reference was made to cultural safety or access to interpreters as mechanisms to improve access and engagement. It is unclear whether these issues were considered in scope, or outside the framework of the refined programme theory?
Yes, this is an excellent point. As the issue of language barriers and use of interpreter services was such a significant issue for these women it was tested in a separate analysis of additional programme theories generated by a previous realist synthesis 9. The refined programme theory is published here: https://equityhealthj.biomedcentral.com/articles/10.1186/s12939-021-01570-8
We have revised the manuscript to clarify this in the discussion (pg 26): 'The wider Project20 evaluation previously tested programme theory relating to interpreter services for pregnant women with social risk factors, finding that despite accessing a specialist model of care women experienced a lack of regulation and access to high-quality interpretation services 10.'
9. Another aspect of the programme theory that I found curious is the assumption in outcome 4 that continuity of care will lead to greater identification of social risk in women who have not previously disclosed issues of concern. One element of the quantitative findings that might suggest otherwise is the very low level of ascertainment of domestic violence (<5% prevalence) and mental health issues (6% overall) in the routinely collected data. It would be interesting to examine these data in light of the programme theory to explore whether there was higher ascertainment in specialist models, and at what stage of pregnancy issues were identified (ie is there evidence that relational continuity results in greater recognition/higher disclosure of social risk factors at later gestations of pregnancy?).
Apologies, this needs clarifying. The findings summarised in the refined programme theory (table 13) refer to where care is situated and who it reaches rather than increased disclosure. The quantitative findings show that a higher deprivation score correlates with more social risk factors, so services situated in deprived areas will be more likely to provide an enhanced level of care to women with this increased risk, whether the women have disclosed their issues or not. This supports an earlier research finding 4 that women under a continuity model of care go on to receive more referrals to support services. The wider evaluation tests the theory that women then go on to feel more able to disclose sensitive issues, and receive an enhanced level of specialist support. Since this review process, that paper has been published in Women and Birth 5. This has now been clarified in this manuscript's discussion (pg 26): Previous research has found a correlation between continuity models of care and increased disclosure and referral to support services 4,5, but it must be acknowledged that under ascertainment of sensitive issues such as mental health and domestic abuse remains likely whilst women perceive services as a form of surveillance and risk 5.
10. Public and patient involvement -How were service users recruited and supported to engage in planning and development of the research? What was the reason for limiting their role in interpretation of data to the qualitative findings?
Revised (pg 7): Multiple representative, diverse groups of service users were involved in the planning and development of this research. They were recruited through local community groups, clinicians, and existing patient involvement groups. Using participatory appraisal methods and online engagement events, recent maternity service users provided feedback on the protocol, study materials, interview guides, and refinement of programme theories. They also prioritised outcome measures and reviewed the qualitative data analysis. Training needs were identified by the service users for analysis of quantitative data and further research addressing maternal health inequities 11.
11. Strengths and limitations: I would encourage the authors to provide a more detailed account of strengths and limitations. What do they see as the main strengths of the approach they have taken? With regard to limitations, other factors to consider include: limitations of variables available in routine data, lack of generalisability outside UK health care system, sample size considerations outlined above.
Revised extensively in both the discussion (pg 28) and summary of strengths and limitations of this research (pg 4 after abstract) with particular reference to the sample size and future Bayesian analysis recommendations:
I want to thank the authors for such a thoughtful and well-put-together mixed methods realist study. I was particularly impressed by how much detail the authors put into obtaining the quantitative and qualitative pieces of information towards their realist/mechanism-based theory formulation.
Thank you for your time reviewing this manuscript and very helpful and encouraging feedback. Please see revisions made below.
Despite the above strengths of the paper, there is a major weakness. While the authors explicitly illustrated how information informing the theory formulation was obtained, the process by which the theory was formulated is not explained. From the information from the qualitative and quantitative sources, I have no idea how the authors obtained their CMO configurations illustrated in Table 13. I think that is a major flaw in the methodology of the paper. Formulating theories in realist research is known as retroductive theorizing, but there was no mention of this in this otherwise well-thought-out paper.
The authors should consider these three resources to help them in this aspect:
Thank you for noting this and for the very useful resources that have been referenced appropriately in the manuscript. The initial programme theories were developed using retroductive theorising in a previous realist synthesis 9 and focus groups with midwives 12 before being tested in this aspect of the research. We have revised supplementary file 2 to clearly show the process of theory refinement between the initial programme theories and the CMO configuration described in table 13. We have also revised the methods section and included figure 1 to visually demonstrate the process of refinement that is then detailed in supplementary file 2. See pg 6: The aims of this study were approached through the testing and refinement of initial programme theories constructed in an earlier synthesis of literature 7 and focus groups with midwives 8 relating to how women with social risk factors access and engage with maternity care (see figure 1 for the theory refinement process). Retroductive theorizing was used to uncover meaningful causal mechanisms, often focusing on how the wider context and human response to different aspects of maternity care leads to specific outcomes 13. This approach offers an epistemologically and ontologically grounded way of integrating mixed methods, often analysing qualitative data to find the causal relationship behind quantitative