Integration of academic and health education for the prevention of physical aggression and violence in young people: systematic review, narrative synthesis and intervention components analysis

Objectives To systematically review evidence on the effectiveness of interventions including integration of academic and health education for reducing physical aggression and violence, and describe the content of these interventions.
Data sources Between November and December 2015, we searched 19 databases and 32 websites and consulted key experts in the field. We updated our search in February 2018.
Eligibility criteria We included randomised trials of school-based interventions integrating academic and health education in students aged 4–18 and not targeted at health-related subpopulations (eg, learning or developmental difficulties). We included evaluations reporting a measure of interpersonal violence or aggression.
Data extraction and analysis Data were extracted independently in duplicate, interventions were analysed to understand similarities and differences and outcomes were narratively synthesised by key stage (KS).
Results We included 13 evaluations of 10 interventions reported in 20 papers. Interventions included either full or partial integration, incorporated a variety of domains beyond the classroom, and used literature, local development or linking of study skills and health promoting skills. Evidence was concentrated in KS2, with few evaluations in KS3 or KS4, and evaluations had few consistent effects; evaluations in KS3 and KS4 did not suggest effectiveness.
Discussion Integration of academic and health education may be a promising approach, but more evidence is needed. Future research should consider the ‘lifecourse’ aspects of these interventions; that is, do they have a longitudinal effect? Evaluations did not shed light on the value of different approaches to integration.


Strengths and limitations of this study
• We used an exhaustive search including 19 databases and 32 websites.
• We used an innovative method to describe key components in this class of interventions.
• However, it was challenging to identify studies for inclusion.
• Meta-analysis was not possible because of the diversity of outcomes and raters.

INTRODUCTION
Violence among young people is a public-health priority due to its prevalence and harms to young people and wider society. 1 2 One UK study found that 10% of young people aged 11-12 reported carrying a weapon and 8% admitted attacking someone with intent to hurt them seriously. 3 By age 15-16, 24% of students reported that they had carried a weapon and 19% reported attacking someone with the intention to hurt them seriously. 3 Early aggression and anti-social behaviour are strongly linked to adult violent behaviour. 4 5 School-based health education can be effective in reducing violence. 6-8 However, school-based health education is increasingly marginal in many high-income countries, partly because of schools' increasing focus on attainment-based performance metrics. In England specifically, health education is not a statutory subject, 9-11 and school inspectors have a limited focus on how schools promote student health. 12 One way to avoid such marginalisation is to integrate health education into academic lessons. For example, health-related content can be reflected in academic lessons by incorporating anti-violence messages within academic subjects, or by linking academic study skills with health promotion skills. This strategy may bring other benefits because: larger 'doses' may be delivered; students may be less resistant to health messages woven into other subjects; and lessons in different subjects may reinforce each other. 13 14 Conversely, those teaching academic subjects may be uninterested or unqualified to teach health topics. Though theories of change in this class of interventions are diffuse, one important way in which they could be effective is by promoting developmental cascades involving the interplay of cognitive and non-cognitive skills. 15 16 Interventions integrating academic and health education could address violence by developing: social and emotional skills such as self-awareness, self-regulation, motivation, empathy and communication; 17 healthier social support or norms among students; 15 18 19 knowledge of the costs 20 and consequences 21 of substance use; media literacy skills to critique harmful media messages; and modifying students' social norms about antisocial behaviours. 13 20 22-24
Despite policy interest in these interventions, they have not previously been the subject of a specific systematic review. Our focus on violence is informed by preliminary consultation, scoping work and logic model development suggesting that violence is an outcome especially amenable to these interventions. In the present review, we examined the characteristics of interventions that integrate academic and health education to prevent violence, and synthesised evidence for their effectiveness.

METHODS
This review was part of a larger evidence synthesis project on theories of change, process evaluations and outcome evaluations of integration of academic and health education for substance use and violence. We registered the protocol for this review on PROSPERO (CRD42015026464, https://www.crd.york.ac.uk/prospero/).

Inclusion and exclusion
Studies were included regardless of publication date or language. We included randomised controlled trials of interventions integrating academic and health education, the former defined as specific academic subjects or general study skills. We included school-based interventions that incorporated health education into academic lessons and interventions that included health education lessons with academic components.
For this review, we focus on violence outcomes, defined as the perpetration or victimisation of physical violence including convictions for violent crime. We included outcomes that were a composite of physical and non-physical (e.g. emotional) interpersonal violence, but excluded composite measures that included items not focused on interpersonal violence, such as damage to property. Interventions focusing on targeted health-related sub-populations (e.g. children with cognitive disabilities) were excluded as we were interested in universal interventions. We excluded interventions that trained teachers in classroom management.

Search strategy
We searched 19 databases and 32 websites, and contacted subject experts (see Online File 1 for full details).

Study selection
Pairs of researchers double-screened titles and abstracts in sets of 50 references until 90% agreement was reached. Subsequently, single reviewers screened each reference. We located the full texts of remaining references and undertook similar pairwise calibration followed by single screening.
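The calibration step described above can be illustrated with a minimal sketch. It assumes that agreement was measured as simple percentage agreement on include/exclude decisions within each set of 50 references; the review does not state whether a chance-corrected statistic was used instead. The function names and the two example batches are hypothetical.

```python
def percent_agreement(decisions_a, decisions_b):
    """Proportion of references on which two screeners made the same decision."""
    if len(decisions_a) != len(decisions_b):
        raise ValueError("both screeners must rate the same references")
    if not decisions_a:
        raise ValueError("batch is empty")
    agreed = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return agreed / len(decisions_a)


def first_calibrated_batch(batches, threshold=0.90):
    """Index of the first batch reaching the agreement threshold, or None."""
    for index, (decisions_a, decisions_b) in enumerate(batches):
        if percent_agreement(decisions_a, decisions_b) >= threshold:
            return index
    return None


# Hypothetical data: two sets of 50 references screened by the same pair.
screener_a = ["include"] * 10 + ["exclude"] * 40
screener_b_early = ["include"] * 3 + ["exclude"] * 47  # 7 disagreements
screener_b_later = ["include"] * 8 + ["exclude"] * 42  # 2 disagreements

batches = [(screener_a, screener_b_early), (screener_a, screener_b_later)]
calibrated_at = first_calibrated_batch(batches)
```

In this illustration the first set falls below the 90% threshold and the second exceeds it, at which point single screening would begin.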
Using an existing tool 25 we extracted data independently in duplicate from included studies and assessed trials for risk of bias using a modified version of the Cochrane assessment tool. 26 We undertook an intervention components analysis. 27 This was undertaken inductively by one researcher and audited by two other researchers, and used intervention descriptions to draw out similarities and differences in intervention design using an iterative method. Finally, we synthesised outcomes narratively due to the heterogeneity in included outcome measurement. We categorised the timing of intervention effect by period of schooling, defined in terms of English schools' key-stage (KS) system. KS1 includes school years 1-2 (age 5-7 years); KS2 includes years 3-6 (age 7-11 years); KS3 includes years 7-9 (age 11-14 years); KS4 includes years 10-11 (age 14-16 years); and KS5 includes years 12-13 (age 16-18 years).
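As an illustration of the key-stage categorisation defined above, the following minimal sketch maps an English school year to its key stage. The function and variable names are hypothetical; the year ranges are those given in the text. Converting school grades from other countries to English school years is a separate step that is not shown here.

```python
# School-year ranges for each key stage, as defined in the text.
KEY_STAGES = {
    "KS1": range(1, 3),    # years 1-2
    "KS2": range(3, 7),    # years 3-6
    "KS3": range(7, 10),   # years 7-9
    "KS4": range(10, 12),  # years 10-11
    "KS5": range(12, 14),  # years 12-13
}


def key_stage_for_year(school_year):
    """Return the key stage label for an English school year (1-13)."""
    for label, years in KEY_STAGES.items():
        if school_year in years:
            return label
    raise ValueError(f"school year {school_year} is outside years 1-13")


# Example: an outcome measured in school year 6 falls within KS2,
# while one measured in school year 7 falls within KS3.
stage_year_6 = key_stage_for_year(6)
stage_year_7 = key_stage_for_year(7)
```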
We could not formally assess publication bias because heterogeneity in outcome measurement precluded meta-analysis.

RESULTS
We found and screened 76,979 references, of which we retained 702 for full-text screening and were able to assess 690. Of 62 relevant reports included in the overall project, 10 evaluations of eight interventions were reported in 14 papers that considered violence and are reported in this review (Figure 1).

Included studies and their quality
All trials randomised schools except the Bullying Literature Project, which randomised classrooms (Table 1). All evaluations were conducted in the USA, except for Gatehouse, 28 which was an Australian study. All control arms consisted of education-as-usual or waitlist controls.
Interventions were diverse and are summarised below in the intervention components analysis. Only two interventions (Bullying Literature Project, 29 Youth Matters 30 ) were wholly delivered by external staff. Several (Gatehouse, 28 Positive Action, 31 Steps to Respect 32 ) linked classroom-based delivery to school-level work to support and reinforce implementation. PATHS 33 and 4Rs 19 also emphasised teachers' professional development.
Evaluation quality varied (Table 2). Appraisal was hampered by poor reporting of some aspects of trial methods. Only three studies reported evidence of low risk of bias for random generation of allocation sequence; the remainder were unclear. No evaluations reported information on concealed allocation. In LIFT, 34 outcome assessors were blinded, resulting in low risk of bias in this domain, but all other interventions were of unclear risk of bias. All interventions included reasonably complete outcome data, and in only one evaluation did unit of analysis issues pose a risk of bias. In some studies such as Steps to Respect, follow-up was shorter than intervention length.

Intervention components analysis
This identified four themes describing included interventions: approach to integration, position of integration, degree of integration and point of integration. Included interventions are described in Table 1, and the components analysis is summarised in Table 3.

Approach to integration
Interventions approached the rationale for and strategy of integration in different and overlapping ways. Several (4Rs, Bullying Literature Project, Steps to Respect, Youth Matters) used literature as the vehicle for integration, with children's books serving as a prompt for social-emotional learning. Another approach emphasised local development, where teachers were encouraged to decide the most appropriate way to integrate activities into daily instruction (PATHS, Positive Action). A third approach was linking to developmental concerns, emphasising not so much the comprehensive integration of academic and health education but rather the interrelationships between academic success and broader development, health and wellbeing (Positive Action, Gatehouse). These interventions viewed academic education through a 'health' lens, in addition to viewing health education through an 'academic' lens.

Degree of integration
In some interventions, health education was fully integrated (woven seamlessly) into everyday academic lessons (Gatehouse, 4Rs, Youth Matters), while in partially integrated interventions, health education involved distinct lessons, albeit also covering academic learning (Positive Action).

Timing of integration
Most interventions were multi-year, though two involved only one school year (LIFT, Bullying Literature Project).

Intervention effects
Perpetration measures included bullying (physical, or physical/verbal), aggression against peers and others, and violent behaviours including injuring others (Table 4). Measures involved different raters, including students, teachers and observers. Victimisation measures ranged from physical violence specifically to interpersonal aggression more generally.
Heterogeneity of definition, measurement and form of effect sizes precluded meta-analysis.

Violence perpetration: KS2
Across the nine evaluations reporting outcomes in this KS, effects were inconsistent, including within studies by rater.
In LIFT, 34 effects at the end of the first intervention year on observed physical aggression in the playground were similar for students with different levels of baseline aggression (d=-0.14 at the mean, 1 SD and 2 SD above the pre-intervention mean); these findings were described as 'statistically significant'. However, after the first intervention year of 4Rs, 19 there were no effects on teacher-reported aggression (b=0.02, SE=0.05, based on a 1-4 scale). After the second intervention year, 15 there were effects on teacher-reported student aggression (d=-0. 21

Violence perpetration: KS3
The two evaluations examining violence perpetration outcomes in KS3 had dissimilar results. At the end of the sixth intervention year of Positive Action Chicago, 39 students receiving the intervention reported lower counts of violence-related behaviours than no-treatment controls (IRR=0.38, 95% CI [0.18, 0.81]; equivalent to d=-0.54). Students also reported fewer bullying behaviours (d=-0.39), and parents reported that their children engaged in fewer bullying behaviours (d=-0.31). Significance values for these estimates were not presented, but both were supported by significant condition-by-time interactions in multilevel models, indicating that the intervention group showed an improved trajectory over time as compared to the control group. In contrast, after the third year from baseline in Youth Matters, 38 proportions of students were not different in the collective bully and bully-victim groups (both groups 16%; IG n=283, CG n=289).

Violence victimisation: KS2
While the five evaluations reporting outcomes in this KS were similar in follow-up period, they did not point to a clear effect. Students receiving the Bullying Literature Project  15). 36 The first trial included playground observation at the end of the first intervention year, which was suggestive of lower levels in bullying victimisation, though these differences were marginally non-significant (0.9, 0.82 vs. 1.01, 0.83; F(72.4)=3.74, p<0.10). 32 Finally, Youth Matters examined bullying victimisation through continuous and dichotomous measures. At the end of the second intervention year, the difference in log-transformed continuous scores suggested a decrease (difference=-0.171, SE=0.083, p=0.049), as did the difference in dichotomous scores (OR=0.61, p=0.098). 30 However, the latent class analysis did not suggest a difference between groups at this point. 38
In Youth Matters, differences in the log-transformed scores for bullying victimisation suggested a decrease in victimisation in intervention recipients as compared to controls, but this difference was not significant (difference=-0.123, SE=0.068, p=0.08). 40 However, at the end of the third intervention year, fewer students in the intervention than control group were members of the victim or bully-victim classes (36%, n=283 vs 45%, n=289). 38 Based on our own chi-square test, this difference was significant (p=0.029). Gatehouse, 28
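The chi-square test reported for the Youth Matters victim and bully-victim classes (36% of 283 vs 45% of 289) can be approximated with the following minimal sketch. It rests on two assumptions: that cell counts can be recovered by rounding the reported percentages, and that a Pearson chi-square test without continuity correction was used. Neither detail is stated in the text, so the result should be read as an approximate check rather than a reproduction of the original calculation.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumption: counts reconstructed from the rounded percentages in the text.
ig_n, cg_n = 283, 289
ig_cases = round(0.36 * ig_n)  # intervention students in victim or bully-victim classes
cg_cases = round(0.45 * cg_n)  # control students in victim or bully-victim classes

chi2, p_value = pearson_chi_square_2x2(
    ig_cases, ig_n - ig_cases,
    cg_cases, cg_n - cg_cases,
)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value falls close to 0.03, in line with the reported value; small discrepancies would be expected because the counts are reconstructed from rounded percentages.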

DISCUSSION
While the integration of academic and health education remains a promising model for the delivery of school-based health education, randomised evaluations were variable in quality and did not consistently report evidence of effectiveness in reducing violence victimisation or perpetration. Evidence was concentrated in KS2, with few evaluations in KS3 or KS4.
Though a formal moderator analysis was not possible, certain intervention models appear more effective than others. Specifically, evaluations of Positive Action in both Chicago 39 and Hawaii 37 showed consistently positive results across diverse measures. This may reflect the involvement of the intervention developer, a factor often associated with improved intervention fidelity (although Positive Action was not unique in this respect among interventions included in our review). It may also reflect that Positive Action included classroom, whole-school and external domain strategies delivered over multiple school years.
Though Gatehouse 28 was similar to Positive Action in its focus on multiple systems, Gatehouse targeted adolescents, whereas Positive Action was delivered from KS2 and also included work with parents. Another possible explanation for our results is that effects for these interventions may take time to emerge. This is plausible given the developmental focus of many of these interventions, and evidence of links between early aggressive behaviour and later violence. 4 5 For example, there was some evidence that effects on aggressive behaviour in 4Rs began to emerge after the second intervention year. 19 While findings were somewhat contradictory across different outcomes for PATHS, there was some evidence that teachers of intervention students reported less aggression in later years of the intervention. 33
This systematic review has strengths and limitations. Identifying relevant studies was challenging, often because of poor intervention description. We were unable to undertake meta-analysis or assessment of publication bias, though the preponderance of null results suggests that projects with non-significant findings are being published. Finally, the diversity of outcome measures and of raters precludes a complete and consistent picture of the effectiveness of these interventions via standardised measures. This is especially important as 'core outcome sets' become relevant in planning evaluations in public health and social science. Most studies focused on bullying, while evaluations of Positive Action 37 39 generally provided the most direct test of violent behaviours specifically.
Future research should seek to understand better the life course aspects of these interventions: that is, how does early school-based intervention impact later-life violent behaviours? From a policy perspective, it is clear that the integration of academic and health education, while possibly an effective intervention, will need to be considered alongside interventions involving other systems to prevent violence. Future evaluations will also contribute by considering the effects of integration in a diversity of ways and mechanisms of action for integration in different types of academic education. For example, contrasts between full and partial integration, which included evaluations did not address, could inform an understanding of how much integration is necessary to support health education messages.

Through teaching a curriculum (including integration of cognitive behavioural principles in English classes) and establishing a school-wide adolescent health team, Gatehouse aims to: build a sense of security and trust in students; enhance skills and opportunities for good communication; and build a sense of positive regard through participation in school life. The intervention was delivered by teachers, supported by the school-wide adolescent health team and by external consultants who themselves were experienced teachers.
Integration was achieved by using English classes to convey cognitive behavioural techniques for self-management, including via a 'critical literacy' approach that uses poetry, literature,

Students enrolled in year 6 and followed up over seven years; 49% female, 51% male; 86% White, 14% ethnic minority; 12% mother less than high school graduate, 8% father less than high school graduate; 36% mother unemployed, 10% father unemployed; 22% single-parent families; 18% receiving benefits; 20% less than $15,000/year in early 1990s

Education as usual
Classroom instruction and discussion on specific social and problem-solving skills followed by skills practice, reinforced during free play using a group cooperation game with review of behaviour and presentation of daily rewards. There is also a parent evening to engage families and opportunities for parents to engage with teachers. The intervention was delivered by teachers and special instructors. Integration was achieved by teaching study skills alongside social-emotional education content.

Positive Action Chicago
Chicago, USA; Li 2011 31, Lewis 2013 39; 7 schools, ~240 students (IG); 7 schools, ~260 students (CG); students enrolled in year 4 and followed up over six years; ~48% female, ~52% male; 55% African American, 32% Hispanic, 9% White non-Hispanic, 4% Asian, 5% other or mixed; 83% receiving free lunch
Teachers provide lessons covering six units: self-concept; positive actions for mind and body; positive social-emotional actions; managing oneself; being honest with oneself; and continually improving oneself. Content includes 140 lessons per grade per year from years 1 to 13. In addition, an implementation coordinator and school climate team are appointed to support the intervention. The intervention is primarily delivered by teachers and school staff; in both trials, this was supported by extensive professional development and training. Integration was achieved by linking academic learning to social-emotional and health-related learning, e.g. by including content on problem solving and study skills alongside positive actions for mind and body, and by encouraging teachers to reflect Positive Action content in academic lessons.

An intervention to reduce conflict by improving students' social-emotional and thinking skills through a curriculum (including study skills), the establishment of a positive classroom environment and generalised positive social norms throughout the school environment. Lessons are grouped into three units addressing readiness and self-control, feelings and relationships, and interpersonal problem solving. These units cover five domains: 1.

Table 3. Key themes in the intervention components analysis.

Key theme: Approach to integration
Components within theme:
• Literature: did interventions use literature and language arts as the key vehicle for delivery?
• Local development: did interventions support teachers to link health education across academic subjects in each school?
• Linking to developmental concerns: did interventions link academic education and personal health and development?

Key theme: Domains of integration
Components within theme:
• Classroom: did interventions focus on the classroom?

Early aggression and anti-social behaviour are strongly linked to adult violent behaviour. [4,5] School-based health education can be effective in reducing violence. [6-8] However, school-based health education is increasingly marginal in many high-income countries, partly because of schools' increasing focus on attainment-based performance metrics. In England specifically, health education is not a statutory subject, [9-11] and school inspectors have a limited focus on how schools promote student health. [12] One way to avoid such marginalisation is to integrate health education into academic lessons. For example, health-related content can be seamlessly integrated into existing academic lessons, or discrete additional health education lessons can include academic learning elements. This strategy may bring other benefits because: larger 'doses' may be delivered; students may be less resistant to health messages woven into other subjects; and lessons in different subjects may reinforce each other. [13,14] Conversely, those teaching academic subjects may be uninterested or unqualified to teach health topics. Though theories of change in this class of interventions are diffuse, one important way in which they could be effective is by promoting developmental cascades involving the interplay of cognitive and non-cognitive skills. [15,16] Interventions integrating academic and health education could address violence by developing: social and emotional skills such as self-awareness, self-regulation, motivation, empathy and communication; [17] healthier social support or norms among students; [15,18,19] knowledge of the costs [20] and consequences [21] of substance use; media literacy skills to critique harmful media messages; and modifying students' social norms about antisocial behaviours. [13,20,22-24]
Our work synthesising the theories of change underlying these interventions (Tancred et al., in press) identified that interventions aimed to integrate and thus erode boundaries between academic and health education, between students and teachers (so that relationships were improved and teachers might function more effectively as behavioural role models), and between classrooms and schools and schools and families (so that violence prevention messages communicated in classrooms might be reinforced by messaging in other settings).
Despite policy interest in these interventions, they have not previously been the subject of a specific systematic review. Previous systematic reviews have focused on socioemotional learning interventions or school-based interventions generally, [6][7][8] without considering interventions that specifically integrate with academic lessons as defined above.
Our focus on violence is informed by preliminary consultation, scoping work and logic model development suggesting that violence is an outcome especially amenable to these interventions. In the present review, we examined the characteristics of interventions that integrate academic and health education to prevent violence, and synthesised evidence for their effectiveness. That is, our research questions were: what are the overarching features relevant to integration of interventions that integrate academic and health education, and are these interventions effective at different key stages in reducing physical aggression and violence?

METHODS
This review was part of a larger evidence synthesis project on theories of change, process evaluations and outcome evaluations of integration of academic and health education for substance use and violence.

Inclusion and exclusion
Studies were included regardless of publication date or language. We included randomised controlled trials of interventions integrating academic and health education, the former defined as specific academic subjects or general study skills. We defined as 'health education' education seeking to improve the health and wellbeing of students (including social and emotional learning and other forms of violence prevention). We included school-based interventions that seamlessly incorporated health education into existing academic lessons and interventions that provided discrete health education lessons with additional academic components. Interventions could be delivered by teachers or other school staff such as teaching assistants, but may also have been delivered by external providers, for example from the health, voluntary or youth service sectors. We did not include interventions solely addressing social conduct in the classroom; relationships with peers or staff; attitudes to education, school or teachers; or aspirations and life goals. Our definition also excluded interventions which: were delivered in mainstream subject lessons but did not aim to integrate health and academic education; trained teachers in classroom management without student curriculum components; or were delivered exclusively outside of classrooms, as these did not seek to integrate academic and health education.
For this review, we focus on violence outcomes, defined as the perpetration or victimisation of physical violence including convictions for violent crime. We included outcomes that were a composite of physical and non-physical (e.g. emotional) interpersonal violence, but excluded composite measures that included items not focused on interpersonal violence, such as damage to property. Interventions focusing on targeted health-related sub-populations (e.g. children with cognitive disabilities) were excluded as we were interested in universal interventions. We excluded interventions that trained teachers in classroom management.

Search strategy
In our original search, undertaken between November and December 2015, we searched 19 databases and 32 websites, and contacted subject experts (see Online File 1 for full details). We subsequently updated our search in February 2018 using PsycINFO and CENTRAL, as all of our original study hits were recovered from these databases.

Study selection
Pairs of researchers double-screened titles and abstracts in sets of 50 references until 90% agreement was reached, with disagreements discussed at every stage. Subsequently, single reviewers screened each reference. We located the full texts of remaining references and undertook similar pairwise calibration with disagreements discussed, followed by single screening. Reports were translated into English where necessary. Using an existing tool [25] we extracted data independently in duplicate from included studies and assessed trials for risk of bias using a modified version of the Cochrane assessment tool. [26] Authors were contacted where study data were missing.

Synthesis methods
We undertook an intervention components analysis. [27] This was undertaken inductively by one researcher and audited by two other researchers, and used intervention descriptions to draw out similarities and differences in intervention design using an iterative method. Intervention descriptions were read and re-read and then coded manually. The goal of this analysis was to use a set of descriptors to characterise aspects of the integration of academic and health education in the intervention. Intervention descriptions were rarely detailed enough to permit 'deep' engagement with the specific content of the interventions provided in included evaluations. Outcomes were synthesised narratively owing to the heterogeneity in included outcome measurement. We categorised the timing of intervention effect by period of schooling, defined in terms of English schools' key-stage (KS) system. KS1 includes school years 1-2 (age 5-7 years); KS2 includes years 3-6 (age 7-11 years); KS3 includes years 7-9 (age 11-14 years); KS4 includes years 10-11 (age 14-16 years); and KS5 includes years 12-13 (age 16-18 years).
We could not formally assess publication bias because heterogeneity in outcome measurement precluded meta-analysis.
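The key-stage bands used to categorise the timing of intervention effects can be written as a simple lookup. The sketch below encodes only the school-year ranges stated above; the constant and function names are hypothetical, and how evaluations conducted outside England were mapped onto school years is not described here and is not assumed.

```python
from typing import Optional

# (key stage, first school year, last school year), as listed in the text
KEY_STAGES = [
    ("KS1", 1, 2),
    ("KS2", 3, 6),
    ("KS3", 7, 9),
    ("KS4", 10, 11),
    ("KS5", 12, 13),
]


def key_stage_for_year(school_year: int) -> Optional[str]:
    """Return the key stage covering an English school year, or None if outside years 1-13."""
    for label, first_year, last_year in KEY_STAGES:
        if first_year <= school_year <= last_year:
            return label
    return None


# Example: an outcome measured in school year 6 falls in KS2.
print(key_stage_for_year(6))
```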

Patient and public involvement
Because this review focused on public health interventions that were generally preventive in nature, patients were not involved per se. However, stakeholders were extensively consulted in the development of research questions and in assessing the implications of the findings. In addition, findings were disseminated via stakeholder events, and a series of one-to-one consultations took place to ensure the relevance and salience of study findings.

RESULTS
In our original search, we found and screened 76,979 references, of which we retained 702 for full-text screening and were able to assess 690. Of 62 relevant reports included in the overall project, 10 evaluations of eight interventions were reported in 14 papers that considered violence and are reported in this review. Our update search yielded 2,355 references, of which we retained 41 for full-text screening and included six papers reporting three evaluations (Figure 1). This yielded a total of 13 evaluations reported in 20 papers.

Included studies and their quality
All trials randomised schools except the Bullying Literature Project, which randomised classrooms (Table 1). All evaluations were conducted in the USA, except for Gatehouse, [28] which was an Australian study. All control arms consisted of education-as-usual or waitlist controls, though Second Step [29][30][31] offered a brief anti-bullying intervention with low take-up.
Interventions were diverse and are summarised below in the intervention components analysis. Only two interventions (Bullying Literature Project, [32] Youth Matters [33]) were wholly delivered by external staff. Several (Gatehouse, [28] Positive Action, [34] Steps to Respect [35]) linked classroom-based delivery to school-level work to support and reinforce implementation. PATHS [36] and 4Rs [19] also emphasised teachers' professional development.
Evaluation quality varied (Table 2). Appraisal was hampered by poor reporting of some aspects of trial methods. Only four studies reported evidence of low risk of bias for random generation of allocation sequence; the remainder were unclear. Only one study reported information on concealed allocation. In LIFT, [37] outcome assessors were blinded, resulting in low risk of bias in this domain, but all other interventions were of unclear risk of bias. All interventions included reasonably complete outcome data, and in only one evaluation did unit of analysis issues pose a risk of bias. In some studies such as Steps to Respect, follow-up was shorter than intervention length. Evaluations also differed in size, ranging from seven classrooms to 63 schools.

Intervention components analysis
This identified four themes describing included interventions: approach to integration, position of integration, degree of integration and point of integration. Included interventions are described in Table 1, and the components analysis is summarised in Table 3.  would also support academic attainment, whereas Positive Action tied together individual student attainment with student health and wellbeing.

Degree of integration
In some interventions, health education was fully integrated (woven seamlessly) into everyday academic lessons (Gatehouse, 4Rs, Youth Matters), while in partially integrated interventions, health education involved distinct lessons, albeit also covering academic learning (Positive Action).

Timing of integration
Most interventions were multi-year, though two involved only one school year (LIFT, Bullying Literature Project).

Intervention effects
Perpetration measures included bullying (physical, or physical/verbal), aggression against peers and others, and violent behaviours including injuring others. Measures involved different raters, including students, teachers and observers. Victimisation measures ranged from physical violence specifically to interpersonal aggression more generally. Heterogeneity of definition, measurement and form of effect sizes precluded meta-analysis. No included studies described effects for KS1 or KS5. Measures and corresponding effect estimates are included in Table 4.

Violence perpetration: KS2
Across the 10 evaluations reporting outcomes in this KS, effects were inconsistent, including within studies by rater.
In LIFT, [37] effects at the end of the first intervention year on observed physical aggression in the playground were similar for students with different levels of baseline aggression (d=-0.14 at mean, 1 SD and 2 SD above the pre-intervention mean); these findings were described as 'statistically significant'. However, after the first intervention year of 4Rs, [19] there were no effects on teacher-reported aggression (regression-estimated b=0.02, Steps to Respect, evaluated in two different trials, also found no differences in student-reported bullying victimisation at the end of the first intervention year in the first (IG:  [42,46], Second Step [30,31] and Gatehouse [28]) and KS4 (Gatehouse [28]) suggested no evidence of effectiveness. In Youth Matters, differences in the log-transformed scores for bullying victimisation suggested a decrease in victimisation in intervention recipients as compared to controls, but this difference was not significant (regression-estimated difference=-0.123, SE=0.068, p=0.08). [46] However, at the end of the third intervention year, fewer students in the intervention than control group were members of the victim or bully-victim classes (36%, n=283 vs 45%, n=289

DISCUSSION
While the integration of academic and health education remains a promising model for the delivery of school-based health education, randomised evaluations were variable in quality and did not consistently report evidence of effectiveness in reducing violence victimisation or perpetration. Evidence was concentrated in KS2, with few evaluations in KS3 or KS4.
Few interventions showed consistent signals of effectiveness. Though a formal moderator analysis was not possible, certain intervention models appear more effective than others. Specifically, evaluations of Positive Action in both Chicago [43] and Hawaii [41] showed consistently positive results across diverse measures. This may reflect the involvement of the intervention developer, a factor often associated with improved intervention fidelity (although Positive Action was not unique in this respect among interventions included in our review). It may also reflect that Positive Action included classroom, whole-school and (in the Hawaii trial) external domain strategies delivered over multiple school years. Though Gatehouse [28] was similar to Positive Action in its focus on multiple systems, Gatehouse targeted adolescents, whereas Positive Action was delivered from KS2 and also included work with parents. Another possible explanation for our results is that effects for these interventions may take time to emerge. This is plausible given the developmental focus of many of these interventions, and evidence of links between early aggressive behaviour and later violence. [4,5] For example, there was some evidence that effects on aggressive behaviour in 4Rs began to emerge after the second intervention year. [19] While findings were somewhat contradictory across different outcomes for PATHS, there was some evidence that teachers of intervention students reported less aggression in later years of the intervention. [36] Another key feature of Positive Action was the use of a model that linked academic and health education to developmental concerns, both in terms of activities as well as in the underlying theory of change.
Moving forward, intervention strategies that combine multiple domains over several years and that use subject-specific learning alongside linking to developmental concerns may be more effective than classroom-only interventions, single-year interventions, or interventions that use literature alone; this should be a target for future research.
This systematic review has strengths and limitations. Identifying relevant studies was challenging, often because of poor intervention description. We were unable to undertake meta-analysis or assessment of publication bias, though the preponderance of null results suggests that projects with non-significant findings are being published.

COMPETING INTERESTS
Authors have no competing interests to disclose.

CONTRIBUTIONS
GJMT undertook study screening and selection, led the meta-analyses and drafted the initial manuscript. TT undertook study screening and selection, extracted data and contributed to drafting the initial manuscript. AF undertook study screening and selection. JT and RC provided methodological and substantive advice. CB undertook study screening and selection, extracted data, and contributed to drafting the initial manuscript. All authors revised the manuscript and approved the final manuscript as submitted.
The authors acknowledge Ms Claire Stansfield for her assistance in designing and conducting the searches.

DATA SHARING STATEMENT
All data are publicly available.

23 Kupersmidt JB, Scull TM, Benson JW. Improving media message interpretation processing skills to promote healthy decision making about substance use: the effects of the middle school media ready curriculum. Journal of Health Communication 2012;17:546-63.
24 Patton G, Bond L, Carlin JB, et al. Promoting social inclusion in schools: group-randomized trial of effects on student health risk behaviour and well-being. American Journal of Public Health 2006;96:1582-7.
25 Peersman G, Oliver S, Oakley A. EPPI-Center review guidelines: data collection for the EPIC database. London: EPPI-Centre Social Science Research Unit 1997.
26 Higgins JPT, Green S.
39 Wang C, Goldberg TS. Using children's literature to decrease moral disengagement and victimization among elementary school students. Psychology in the Schools 2017:No-Specified.
40 Brown EC, Low S, Smith BH, et al. Outcomes from a school-randomized controlled trial of steps to respect: A bullying prevention program. School Psychology Review 2011;40:423.
41 Beets
Information sources (item 7): Describe all information sources (e.g., databases with dates of coverage, contact with study authors to identify additional studies) in the search and date last searched. Reported: Online File 1.
Search (item 8): Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated. Reported: Online File 1.
Study selection (item 9): State the process for selecting studies (i.e., screening, eligibility, included in systematic review, and, if applicable, included in the meta-analysis). Reported: page 7.
Data collection process (item 10): Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators. Reported: page 7.
Data items (item 11): List and define all variables for which data were sought (e.g., PICOS, funding sources) and any assumptions and simplifications made. Reported: Online File 1.
Risk of bias in individual studies (item 12): Describe methods used for assessing risk of bias of individual studies (including specification of whether this was done at the study or outcome level), and how this information is to be used in any data synthesis.

Study selection (item 17): Give numbers of studies screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage, ideally with a flow diagram. Reported: page 9, Figure 1.
Study characteristics (item 18): For each study, present characteristics for which data were extracted (e.g., study size, PICOS, follow-up period) and provide the citations. Reported: pages 8-10, Tables 1-3.
Risk of bias within studies (item 19): Present data on risk of bias of each study and, if available, any outcome level assessment (see item 12). Reported: Table 2.
Results of individual studies (item 20): For all outcomes considered (benefits or harms), present, for each study: (a) simple summary data for each intervention group (b) effect estimates and confidence intervals, ideally with a forest plot.

Strengths and limitations of this study
• We used an exhaustive search including 19 databases and 32 websites.
• We used an innovative method to describe key components in this class of interventions.
• However, it was challenging to identify studies for inclusion.
• Meta-analysis was not possible because of the diversity of outcomes and raters.

Early aggression and anti-social behaviour are strongly linked to adult violent behaviour. [4,5] School-based health education can be effective in reducing violence. [6][7][8] However, school-based health education is increasingly marginal in many high-income countries, partly because of schools' increasing focus on attainment-based performance metrics. In England specifically, health education is not a statutory subject, [9][10][11] and school inspectors have a limited focus on how schools promote student health. [12] One way to avoid such marginalisation is to integrate health education into academic lessons. For example, health-related content can be seamlessly integrated into existing academic lessons, or discrete additional health education lessons can also include academic learning elements. This strategy may bring other benefits because: larger 'doses' may be delivered; students may be less resistant to health messages woven into other subjects; and lessons in different subjects may reinforce each other. [13,14] Conversely, those teaching academic subjects may be uninterested in, or unqualified to teach, health topics. Though theories of change in this class of interventions are diffuse, one important way in which they could be effective is by promoting developmental cascades involving the interplay of cognitive and non-cognitive skills. [15,16] Interventions integrating academic and health education could address violence by developing: social and emotional skills such as self-awareness, self-regulation, motivation, empathy and communication; [17] healthier social support or norms; [15,18,19] knowledge of the costs [20] and consequences [21] of substance use; and media literacy skills to critique harmful media messages; and by modifying students' social norms about antisocial behaviours. [13,20,22-24]
Our work synthesising the theories of change underlying these interventions (Tancred et al., in press) identified that interventions aimed to integrate and thus erode boundaries between academic and health education, between students and teachers (so that relationships were improved and teachers might function more effectively as behavioural role models), and between classrooms and schools and schools and families (so that violence prevention messages communicated in classrooms might be reinforced by messaging in other settings).
Despite policy interest in these interventions, they have not previously been the subject of a specific systematic review. Previous systematic reviews have focused on socioemotional learning interventions or school-based interventions generally, [6][7][8] without considering interventions that specifically integrate with academic lessons as defined above.
Our focus on violence is informed by preliminary consultation, scoping work and logic model development suggesting that violence is an outcome especially amenable to these interventions. In the present review, we examined the characteristics of interventions that integrate academic and health education to prevent violence, and synthesised evidence for their effectiveness. That is, our research questions were: what are the overarching features relevant to integration of interventions that integrate academic and health education, and are these interventions effective at different key stages in reducing physical aggression and violence?

METHODS
This review was part of a larger evidence synthesis project on theories of change, process evaluations and outcome evaluations of integration of academic and health education for substance use and violence. We registered the protocol for this review on PROSPERO

Inclusion and exclusion
Studies were included regardless of publication date or language. We included randomised controlled trials of interventions integrating academic and health education, the former defined as specific academic subjects or general study skills. We defined as 'health education' education seeking to improve the health and wellbeing of students (including social and emotional learning and other forms of violence prevention). We included school-based interventions that seamlessly incorporated health education into existing academic lessons and interventions that provided discrete health education lessons with additional academic components. Interventions could be delivered by teachers or other school staff such as teaching assistants, but may also have been delivered by external providers, for example from the health, voluntary or youth service sectors. We did not include interventions solely addressing social conduct in the classroom; relationships with peers or staff; attitudes to education, school or teachers; or aspirations and life goals. Our definition also excluded interventions which: were delivered in mainstream subject lessons but did not aim to integrate health and academic education; trained teachers in classroom management without student curriculum components; or were delivered exclusively outside of classrooms, as these did not seek to integrate academic and health education. Interventions focusing on targeted health-related sub-populations (e.g. children with cognitive disabilities) were excluded as we were interested in universal interventions.
For this review, we focus on violence outcomes, defined as the perpetration or victimisation of physical violence including convictions for violent crime. While we preferred direct measures of physically violent and physically aggressive behaviours, we included outcomes that were a composite of physical and non-physical (e.g. verbal or emotional) interpersonal violence, but excluded composite measures that included items not focused on interpersonal violence, such as damage to property.

Synthesis methods
We undertook an intervention components analysis. [27] This was undertaken inductively by one researcher and audited by two other researchers, and used intervention descriptions to draw out similarities and differences in intervention design using an iterative method. Intervention descriptions were read and re-read and then coded manually. The goal of this analysis was to use a set of descriptors to characterise aspects of the integration of academic and health education in the intervention. Intervention descriptions were rarely detailed enough to permit 'deep' engagement with the specific content of the interventions provided in included evaluations. We categorised the timing of intervention effect by period of schooling, defined in terms of English schools' key-stage (KS) system. KS1 includes school years 1-2 (age 5-7 years); KS2 includes years 3-6 (age 7-11 years); KS3 includes years 7-9 (age 11-14 years); KS4 includes years 10-11 (age 14-16 years); and KS5 includes years 12-13 (age 16-18 years).
We could not formally assess publication bias because heterogeneity in outcome measurement precluded meta-analysis.

RESULTS
In our original search, we found and screened 76,979 references, of which we retained 702 for full-text screening and were able to assess 690. Of 62 relevant reports included in the overall project, 10 evaluations of eight interventions were reported in 14 papers that considered violence and are reported in this review. Our update search yielded 2,355 references, of which we retained 41 for full-text screening and included six papers reporting three evaluations (Figure 1). This yielded a total of 13 evaluations reported in 20 papers.

Intervention components analysis
This identified four themes describing included interventions: approach to integration, position of integration, degree of integration and point of integration. Included interventions are described in Table 1, and the components analysis is summarised in Table 3.

Timing of integration
Most interventions were multi-year, though two involved only one school year (LIFT, Bullying Literature Project).

Measures and corresponding effect estimates are included in Table 4.

Violence perpetration: KS2
Across the 10 evaluations reporting outcomes in this KS, effects were inconsistent, including within studies by rater.

DISCUSSION
While the integration of academic and health education remains a promising model for the delivery of school-based health education, randomised evaluations were variable in quality and did not consistently report evidence of effectiveness in reducing violence victimisation or perpetration. Evidence was concentrated in KS2, with few evaluations in KS3 or KS4. Moreover, evidence was stronger in quantity and in quality for violence perpetration as compared to victimisation. Unfortunately, evaluations that measured perpetration did not always also measure victimisation, preventing a meaningful comparison of consistency of effects.
Few interventions showed consistent signals of effectiveness. Though a formal moderator analysis was not possible, certain intervention models appear more effective than others. Specifically, evaluations of Positive Action in both Chicago [43] and Hawaii [41] showed consistently positive results across diverse measures. This may reflect the involvement of the intervention developer, a factor often associated with improved intervention fidelity (although Positive Action was not unique in this respect among interventions included in our review). It may also reflect that Positive Action included classroom, whole-school and (in the Hawaii trial) external domain strategies delivered over multiple school years. Though Gatehouse [28] was similar to Positive Action in its focus on multiple systems, Gatehouse targeted adolescents, whereas Positive Action was delivered from KS2 and also included work with parents. Another possible explanation for our results is that effects for these interventions may take time to emerge. This is plausible given the developmental focus of many of these interventions, and evidence of links between early aggressive behaviour and later violence. [4,5] For example, there was some evidence that effects on aggressive behaviour in 4Rs began to emerge after the second intervention year. [19] While findings were somewhat contradictory across different outcomes for PATHS, there was some evidence that teachers of intervention students reported less aggression in later years of the intervention. [36] Another key feature of Positive Action was the use of a model that linked academic and health education to developmental concerns.
That is to say, this intervention focused on improvements in academic engagement and study skills both enhancing, and being enhanced by, student health and wellbeing; this was a feature of intervention activities and of the underlying theory of change. Moving forward, intervention strategies that combine multiple domains over several years and that use subject-specific learning alongside linking to developmental concerns may be more effective than classroom-only interventions, single-year interventions, or interventions that use literature alone; this should be a target for future research.
This systematic review has strengths and limitations. Identifying relevant studies was challenging, often because of poor intervention description. We were unable to undertake meta-analysis or assessment of publication bias, though the preponderance of null results suggests that projects with non-significant findings are being published. Finally, the diversity of outcome measures and of raters precludes a complete and consistent picture of the effectiveness of these interventions via standardised measures. For example, measures that included physical violence and aggression were at times combined with verbal forms of interpersonal violence; while we preferred measures of physical violence and physical aggression, we included outcomes where these behaviours were included as part of a composite. Consistency and clarity in outcome reporting will be especially important as 'core outcome sets' become relevant in planning evaluations in public health and social science. Most studies focused on bullying, while evaluations of Positive Action [41,43] generally provided the most direct test of violent behaviours specifically.

COMPETING INTERESTS
Authors have no competing interests to disclose.

Linking the Interests of Families and Teachers (LIFT)
Pacific Northwest, USA. Reid 1999 [37]
Sample: 3 schools, 214 students (IG); 3 schools, 147 students (CG). Students enrolled in year 6 and followed up over seven years.
Participant characteristics: 49% female, 51% male; 86% White, 14% ethnic minority; 12% mother less than high school graduate, 8% father less than high school graduate; 36% mother unemployed, 10% father unemployed; 22% single-parent families; 18% receiving benefits; 20% less than $15,000/year in early 1990s.
Intervention: Classroom instruction and discussion on specific social and problem-solving skills followed by skills practice, reinforced during free play using a group cooperation game with review of behaviour and presentation of daily rewards. There is also a parent evening to engage families and opportunities for parents to engage with teachers. The intervention was delivered by teachers and special instructors. Integration was achieved by teaching study skills alongside social-emotional education content, and was delivered over the course of one school year.
Control: Education as usual.

Positive Action Chicago
Chicago, USA. Teachers provide lessons covering six units: self-concept; positive actions for mind and body; positive social-emotional actions; managing oneself; being honest with oneself; and continually improving oneself. Content includes 140 lessons per grade per year from years 1 to 13. In addition, an implementation coordinator and school climate team are appointed to support the intervention. The intervention is primarily delivered by teachers and school staff; in both trials, this was supported by extensive professional development and training. Integration was achieved by linking academic learning to social-emotional and health-related learning, e.g. by including content on problem solving and study skills alongside positive actions for mind and body, and by encouraging teachers to reflect Positive Action content in academic lessons.

Positive Action Hawaii
Hawaii, USA

An intervention to reduce conflict by improving students' social-emotional and thinking skills through a curriculum (including study skills), the establishment of a positive classroom environment and generalised positive social norms throughout the school environment. Lessons are grouped into three units addressing readiness and self-control, feelings and relationships, and interpersonal problem solving. These units cover five domains: 1. Self-control; 2. Emotional understanding; 3. Positive self-esteem; 4. Healthy relationships; and 5. Interpersonal problem-solving skills. The intervention is delivered by teachers supported by consultants, with 131 lessons delivered over three years (two to three times per week, 20 to 30 minutes each). Integration was achieved by linking study skills to social-emotional learning, by supporting teachers to include children's literature in reinforcing concepts, and by providing ideas to link PATHS to English, social studies and history lessons.

Searches
Our search strategy will be informed by those used in previous systematic reviews focused on school interventions addressing alcohol, smoking, drug use and violence. The studies sought by this review are not likely to be reliably indexed in databases with controlled vocabularies, so we anticipate our searches involving a large number of free text terms. We will take the following three key concepts from the inclusion criteria to develop the search string: health education; integration with academic learning; and children and young people or schools. The combination of these concepts is sensitive enough to include all available studies regardless of study design. The three concepts will be linked by the Boolean operator "AND". Our searches will involve different free text and controlled vocabulary terms for each of these three concepts, linked by the Boolean operator "OR". In our use of terms relating to health education, we will use a very broad array of terms to minimise the risk of publication bias. We will not restrict the searches by date, language or publication type. We will search the following databases from inception to present: ASSIA;

Types of study to be included
In order to address RQ 1 and 3, we will include studies reporting on process evaluations. This would include studies reporting on planning, delivery, receipt or causal pathways using quantitative and/or qualitative data. These studies may report exclusively on process evaluations or report process alongside outcome data. In order to address RQ 1 and 4, we will include studies reporting on outcome evaluations, using randomized controlled trials allocating schools, classes or individuals.
Controls will be students, classes or schools allocated randomly to a control group in which no or usual school health and academic education is delivered, or to a control group including another 'active' intervention. In order to address RQ2 we will draw on included process and outcome evaluations as defined above which include descriptions of intervention theories of change or logic models. In order to address RQ5, we will draw on syntheses of all of the above study types.
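The search string described under Searches combines synonyms within each concept using "OR" and joins the concepts using "AND". A minimal sketch of that construction follows; the function name and the abbreviated term lists are hypothetical illustrations, not the review's actual search terms, and real database syntax will differ by platform.

```python
def build_search_string(concepts):
    """Join terms within a concept with OR, then join concepts with AND."""
    blocks = []
    for terms in concepts:
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Hypothetical, heavily abbreviated term lists for the three concepts:
# health education; integration with academic learning; young people or schools.
concepts = [
    ["health education", "health promotion"],
    ["integrated", "cross-curricular"],
    ["school", "pupil", "adolescent"],
]

query = build_search_string(concepts)
print(query)
```

Keeping each concept as its own list makes it straightforward to broaden one concept (for example, the health education terms) without touching the others.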

Condition or domain being studied
The proposed review focuses on substance use (alcohol consumption, smoking and drug use) and violence since these are important, inter-correlated outcomes which are addressed by interventions sharing common theories of change. Alcohol has been suggested to be the most harmful substance in the UK. Treating alcohol-related diseases costs the NHS in England an estimated £3.5 billion annually. The total annual societal costs of alcohol use in England are estimated at £21 billion. Alcohol-related harms are strongly stratified by socioeconomic status (SES). Early initiation of alcohol use and excessive drinking are linked to later heavy drinking and alcohol-related harms and poor health. Alcohol use among young people is associated with truancy, exclusion, and poor attainment, as well as unsafe sexual behaviour, unintended pregnancies, youth offending, accidents/injuries and violence. Preventing young people from taking up smoking is another key public health objective with 80,000 deaths due to smoking each year. In 2005-6, smoking cost the NHS £5.2 billion and wider costs amounted to £96 billion. Of smokers, 40% start in secondary school and early initiation is associated with heavier and more enduring smoking and greater mortality. Smoking among young people is a major source of health inequalities. Among UK 15-16 year olds, 25% have used cannabis and 9% have used other illicit drugs. Early initiation and frequent use of 'soft' drugs may be a potential pathway to more problematic drug use in later life. Drugs such as cannabis and ecstasy are associated with increased risk of mental health problems, particularly among frequent users. Young people's drug use is also associated with accidental injury, self-harm, suicide and other 'problem' behaviours. The proposed review's other primary outcome is violence. The prevalence, harms and costs of violence among young people mean that addressing this is a public health priority.
One UK study found that 10% of young people aged 11-12 reported carrying a weapon and 8% admitted attacking someone with intent to hurt them seriously. By age 15-16, 24% of students report that they have carried a weapon and 19% reported attacking someone with the intention to hurt them seriously. There are also links between aggression and anti-social behaviours in youth and violent crime in adulthood. As well as leading to further health inequalities, the economic costs to society of youth aggression, bullying and violence are high. For example, the total cost of crime attributable to conduct problems in childhood has been estimated at about £60 billion a year in England and Wales.

Participants/population
We will include studies conducted where a majority of participants are children and young people aged 4-18 years attending schools.

Comparator(s)/control
In order to address RQ 1 and 4, we will include studies reporting on outcome evaluations, using randomized controlled trials allocating schools, classes or individuals. Controls will be students, classes or schools allocated randomly to a control group in which no or usual school health and academic education is delivered, or to a control group including another 'active' intervention.

Context
Schools serving students aged 4-18 years.

Primary outcome(s)
We will include studies addressing one or more of the following primary review outcomes: smoking; alcohol use; legal or illegal drug use; and violence.

Timing and effect measures
We will include studies addressing one or more of the following primary review outcome measures: smoking (e.g. salivary cotinine, carbon monoxide levels, self-reported use of cigarettes); alcohol use (e.g. self-reported alcohol consumption via questionnaires or diaries); legal or illegal drug use (e.g. self-reported drug use); and violence (self-reported perpetration, for example carried a weapon or got into a fight, and victimisation). Informed by existing systematic reviews focused on substance use and violence among young people, outcome measures may draw on dichotomous or continuous variables, and self-report or observational data. They may use measures of frequency (monthly, weekly or daily), the number of episodes of use or an index constructed from multiple measures. Alcohol measures may examine alcohol consumption or problem drinking. Drug outcomes may examine drugs in general or specific illicit drugs, including drug convictions. Measures of violent and aggressive behaviour may examine the perpetration or victimization of physical violence including convictions for violent crime. We will regard follow-up times of less than three months, three months to one year and more than one year post-intervention as different outcomes.
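The three follow-up bands named above can be expressed as a small classification rule. The sketch below is illustrative only: the function name is hypothetical, and the handling of the boundaries (exactly three months and exactly twelve months fall in the middle band) is an assumption, since the protocol does not state how boundary values are treated.

```python
def follow_up_band(months_post_intervention):
    """Assign a follow-up time (in months after the intervention) to a band."""
    if months_post_intervention < 0:
        raise ValueError("follow-up time cannot be negative")
    if months_post_intervention < 3:
        return "less than three months"
    # Assumption: exactly 3 and exactly 12 months belong to the middle band.
    if months_post_intervention <= 12:
        return "three months to one year"
    return "more than one year"


# Hypothetical follow-up times for illustration.
for months in (1, 6, 24):
    print(months, "->", follow_up_band(months))
```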

Secondary outcome(s)
Though not an inclusion criterion, we will assess academic attainment as a secondary outcome.

Timing and effect measures
Academic attainment might be measured using, for example, student standardised academic test scores, IQ tests or other validated scales, or school academic performance.

Data extraction (selection and coding)
Selection of studies
Search results will be downloaded into EPPI-Reviewer 4. An inclusion criteria worksheet with guidance notes will be prepared and piloted by two reviewers screening the same 50 references. Where the two reviewers disagree, they will meet to discuss this and if possible reach a consensus. If the reviewers cannot reach consensus regarding inclusion of a specific article, judgement for selection will be referred to a third reviewer. If necessary, we will organise translation of papers published in languages in which we are not proficient. After piloting and any refinements, each reference will be screened on the basis of title and abstract for potential inclusion by one reviewer, using text-mining to prioritise screening the most relevant studies first. Full reports will be obtained for those references judged as meeting our inclusion criteria or where there is insufficient information from the title and abstract to judge inclusion. A second round of screening will then occur focused on full study reports to determine which studies are included in the review. We will maintain a record of the selection process for all screened material.

Data extraction and management
Two reviewers will independently extract data from all studies meeting the inclusion criteria, using a piloted data extraction form with guidance developed for this review. Where the two authors disagree, they will meet to discuss this and if possible reach a consensus. If the reviewers cannot reach consensus regarding the particulars of data extraction for a specific study, judgement will be referred to a third reviewer.
Included studies will be described using the EPPI-Centre classification system for health promotion and public health research, supplemented by additional codes developed for this review. For all studies where relevant, we will extract information pertaining to: basic study details (individual and organizational participant characteristics, study location, timing and duration, research questions or hypotheses); study design and methods (design, allocation, blinding, sample size, control of confounding, accounting for data clustering, data collection, attrition, analysis); intervention characteristics (timing and duration, programme development, theoretical framework/logic model, content and activities, providers and details of any intervention offered to the control group); process evaluation of the intervention (feasibility, fidelity/quality, intensity, coverage/accessibility, acceptability, mechanism and context using an adapted version of an existing tool); outcome measures at follow-up(s) (reliability of measures, effect size both overall and where available by age, sex, socio-economic status and ethnic sub-group). The two reviewers will independently enter data from the data extraction forms into EPPI-Reviewer 4. If included studies are reported in languages that cannot be translated by the review team, a review author will complete the data extraction form in conjunction with a translator.
Published reports may be incomplete in a wide range of ways. For example: they may not report sufficient detail about their participants for our equity analysis; they may not present information on all the outcomes that were measured (possibly resulting in outcome reporting bias); they may not provide sufficient information about the intervention for accurate characterisation; and they may not report the necessary statistical information for the calculation of effect sizes. In all cases where there is a danger of missing data affecting our analysis, we will contact authors of papers wherever possible to request additional information.
If authors are not traceable or sought information is unavailable from the authors within two months of contacting them, we will record that the study information is missing on the data extraction form, and this will be captured in our risk of bias assessment of the study.

Risk of bias (quality) assessment
We will assess the quality of theories of change using a modified version of the criteria developed in our ongoing NIHR-funded systematic review of positive youth development interventions, which for example assess the clarity with which constructs are defined and inter-related. We will assess the quality of the qualitative and quantitative elements of process evaluations using standard Critical Appraisal Skills Programme and EPPI-Centre tools. These address the rigour of: sampling; data collection; data analysis; the extent to which the study findings are grounded in the data; whether the study privileges the perspectives of participants; the breadth of findings; and depth of findings. These are then used to assign studies to two categories of 'weight of evidence'. First, reviewers will assign a weight (low, medium or high) to rate the reliability or trustworthiness of the findings (the extent to which the methods employed were rigorous/could minimise bias and error in the findings). Second, reviewers will assign an additional weight (low, medium, high) to rate the usefulness of the findings for shedding light on factors relating to the research questions. Guidance will be given to reviewers to help them reach an assessment on each criterion and the final weight of evidence. The two reviewers will then meet to compare their assessments, resolving any differences through discussion and, where necessary, by calling on a third reviewer. For outcome evaluations, we will assess risk of bias within each included study using the tool outlined in the Cochrane Handbook for Systematic Reviews of Interventions. For each study, two reviewers will independently judge the likelihood of bias in seven domains: sequence generation; allocation concealment; blinding (of participants, personnel, or outcome assessors); incomplete outcome data; selective outcome reporting; other sources of bias (e.g. recruitment bias in cluster-randomised studies); and intensity/type of comparator.
Each study will subsequently be identified as 'high risk', 'low risk' or 'unclear risk' within each domain. In cases of disagreement, the reviewers will meet to seek consensus but where they cannot, we will refer judgement to a third reviewer. We will assess reporting bias according to Sterne's guidance. We will reduce the effect of reporting bias by focusing synthesis on studies rather than publications, avoiding duplicated data. Following the Cho statement on redundant publications, we will attempt to detect duplicate studies and, if multiple articles report on the same study, we will extract data only once. We will prevent location bias by searching across multiple databases. We will prevent language bias by not excluding any article based on language.
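The dual, independent risk-of-bias judgement described above can be sketched as a comparison of two reviewers' ratings across the seven domains, flagging disagreements for discussion or referral to a third reviewer. The data layout, function name and example ratings are hypothetical; this is not the review team's actual tooling.

```python
DOMAINS = [
    "sequence generation",
    "allocation concealment",
    "blinding",
    "incomplete outcome data",
    "selective outcome reporting",
    "other sources of bias",
    "intensity/type of comparator",
]
ALLOWED = {"high risk", "low risk", "unclear risk"}


def compare_judgements(reviewer_a, reviewer_b):
    """Return (agreed ratings by domain, domains needing consensus discussion)."""
    agreed, disputed = {}, []
    for domain in DOMAINS:
        a, b = reviewer_a[domain], reviewer_b[domain]
        if a not in ALLOWED or b not in ALLOWED:
            raise ValueError(f"invalid rating for {domain!r}")
        if a == b:
            agreed[domain] = a
        else:
            disputed.append(domain)
    return agreed, disputed


# Hypothetical ratings for a single study.
a = {d: "low risk" for d in DOMAINS}
b = dict(a, blinding="unclear risk")
agreed, disputed = compare_judgements(a, b)
print(disputed)
```

Domains returned in the second list would go to a consensus meeting and, failing that, to the third reviewer.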
Strategy for data synthesis
RQ1 and 2: Thematic synthesis of intervention descriptions and process data
Using thematic synthesis methods we will undertake a number of syntheses. Intervention descriptions (RQ1) and theories of change (RQ2) will first be analysed to develop a taxonomy of interventions integrating health and academic education. Syntheses of theories of change (RQ2) and process evaluations (RQ3) will be used to understand potential mechanisms of action. Syntheses of process evaluations (RQ3) will be used to understand: characteristics of interventions, participants and context acting as potential barriers and facilitators of implementation and receipt (RQ2); and an assessment of potential applicability to the UK. These syntheses will not be restricted to studies judged to be of high quality. Instead conclusions drawing on poorer quality evidence will be given less interpretive weight. First, the reviewers will prepare detailed evidence tables to describe: the methodological quality of each study; details of the intervention examined; study site/population; and full findings. Second, the two reviewers will undertake pilot analysis of two studies. The reviewers will read and re-read data contained within the evidence tables relating to the two high-quality studies, applying line-by-line codes to capture the content of the data. They will draft memos explaining these codes. Coding will begin with in-vivo codes which closely reflect the words used in findings sections. The reviewers will then group and organise codes, applying axial codes reflecting higher-order themes. The two reviewers will meet to compare and contrast their coding of these first two high-quality studies, developing an overall set of codes. Third, the two reviewers will go on to code the remaining studies drawing on the agreed set of codes but developing new in-vivo and axial codes as these arise from the analytical process, and again writing memos to explain these codes.
At the end of this process, the two reviewers will meet to compare their sets of codes and memos. They will identify commonalities, differences of emphasis and contradictions with the aim of developing an overall analysis which draws on the strengths of the two sets of codes and which resolves any contradictions or inconsistencies, drawing on a third reviewer if necessary to achieve this. Through this process we will develop an explanatory framework to understand factors affecting implementation. Results will be presented to PPI stakeholders who will determine which interventions they think are applicable to the UK.

RQ4: Synthesis of outcome data
We will first produce a narrative account of the effectiveness of these types of interventions. This narrative synthesis will be ordered by outcome then within this by age group, intervention type and follow-up time.
Outcomes will be categorised into violence, smoking tobacco, drinking alcohol, using other drugs and academic attainment. Age will be categorised by the key-stage age-ranges used in the English educational system. Categorisation by intervention type will be informed by our prior thematic synthesis of intervention descriptions and theories of change through which we will have produced a taxonomy of interventions. This taxonomy may refer to: whether interventions incorporate health education into other, mainstream school subjects or aim for health education lessons to include teaching of academic as well as health knowledge and skills; lesson frequency; style of delivery; or other aspects of interventions which appear to be critical from our preliminary synthesis. We will describe study results in the 'characteristics of included studies' table, or enter the data into additional tables. We will then produce forest plots for each of our review outcomes, with separate plots for different outcomes and follow-up times, age groups and intervention types. Plots will include point estimates and standard errors for each study, such as risk ratios for dichotomous outcomes or standardised mean differences for continuous outcomes. Once we know the number of studies and the extent of heterogeneity among the studies (as determined both by Cochran's Q test and inspection of the I²), we will make a decision whether to calculate pooled effect sizes. The results of statistical tests will be evaluated in accordance with the Cochrane handbook. If an indication of substantial heterogeneity is determined (e.g. study-level I² value greater than 50%) that cannot be explained through meta-regressions, then we will not produce a pooled estimate and will present only the narrative summary. When studies are found to be statistically heterogeneous, we will use a random-effects model; otherwise we will use a fixed-effects model.
When using the random-effects model, we will conduct a sensitivity check by using the fixed-effect model to reveal differences in results. If we do produce pooled estimates, we will consider using a multilevel meta-analysis model to synthesise effect sizes. This is because outcome evaluations are likely to include multiple measures of conceptually related outcomes and multilevel meta-analysis improves on previous strategies for dealing with multiple relevant effect sizes per study, such as meta-analysing within studies or choosing one effect size, by including all relevant effect sizes but adjusting for inter-dependencies within studies. Unlike multivariate meta-analysis, it does not require the variance-covariance matrix of included effect sizes to be known. We will estimate separate models for substance use, violence and educational attainment outcomes, and for different age-ranges. We will examine substance use outcomes together in one analysis, as well as separated into smoking, alcohol, illicit drug use and any 'omnibus' measures of substance use. We will regard follow-up times of less than three months, three months to one year and more than one year post-intervention as different outcomes. We will run these models for interventions overall and where sufficient studies are found we will run separate models for different intervention categories and comparators. This categorisation will be informed by the taxonomy derived from our prior synthesis of intervention descriptions and theories of change. Where meta-analyses are performed, we will include pooled effect sizes in forest plots, with the individual study point estimates weighted by a function of their precision.
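The heterogeneity checks and pooling choices described above can be illustrated with a short sketch. It computes inverse-variance fixed-effect and random-effects estimates together with Cochran's Q and I². The protocol does not name a between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, and the function name and input values are hypothetical. It does not implement the multilevel model the protocol also considers.

```python
import math


def pool(effects, std_errors):
    """Inverse-variance pooling with Cochran's Q, I-squared and a
    DerSimonian-Laird random-effects estimate (estimator is an assumption)."""
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "random": random_est,
        "random_se": math.sqrt(1.0 / sum(w_re)),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical standardised mean differences and standard errors.
result = pool([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(result)
```

With these made-up inputs Q is about 8 on 2 degrees of freedom and I² is about 75%, which under the rule above (I² greater than 50%) would prompt either a narrative summary or a random-effects model with a fixed-effect sensitivity check.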
Prior to synthesis, we will check for correct analysis (where appropriate) by cluster and report values of: intracluster correlation coefficients (ICC), cluster size, data for all participants or effect estimates and standard errors. Where proper account has not been taken of data clustering, we will correct for this by inflating the standard error by the square root of the design effect. Where ICCs are not reported, we will contact authors to request this information or impute one, based on values reported in other studies. Where imputation is necessary, we will undertake sensitivity analyses to assess the impact of a range of possible values. In other instances of missing data (such as missing population information), it may not be possible to include a study in a particular analysis if, for example, it is impossible to classify the population using our equity tool. We will use the GRADE approach as described in the Cochrane Handbook for Systematic Reviews of Interventions to present the quality of evidence and 'Summary of findings' tables. The downgrading of the quality of a body of evidence for a specific outcome will be based on five factors: limitations of study; indirectness of evidence; inconsistency of results; precision of results; and publication bias. The GRADE approach specifies four levels of quality (high, moderate, low and very low). If sufficient studies are found, we will draw funnel plots to assess the presence of possible publication bias (trial effect versus standard error). While funnel plot asymmetry may indicate publication bias, this can be misleading with a small number of studies. We will discuss possible explanations for any asymmetry in the review in light of our number of included studies. We will undertake a sensitivity analysis to explore whether the findings of the review are robust in light of the decisions made during the review process. 
We will also assess the impact of risk of bias in the included studies via restricting analyses to studies deemed to be at low risk of selection bias, performance bias and attrition bias. Where data allow, we will undertake additional exploratory meta-analyses to determine intervention effects on theorised intermediate outcomes (such as knowledge, skills, social norms) to examine the plausibility that these might mediate or otherwise precede behavioural effects. Such analyses will be informed by the synthesis of theories of change and process evaluation findings to avoid data-dredging.
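The clustering correction described above (inflating the standard error by the square root of the design effect) can be sketched as follows. The sketch uses the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC, where m is the average cluster size; the function names and numeric values are hypothetical, and the loop over several ICC values mirrors the sensitivity analysis proposed for imputed ICCs.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Design effect for clusters of (approximately) equal size."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_standard_error(std_error, mean_cluster_size, icc):
    """Inflate an unadjusted standard error by sqrt(design effect)."""
    return std_error * math.sqrt(design_effect(mean_cluster_size, icc))


# Hypothetical trial: unadjusted SE of 0.10 with 21 students per cluster.
# A range of imputed ICC values serves as a simple sensitivity analysis.
for icc in (0.01, 0.05, 0.10):
    adjusted = inflate_standard_error(0.10, 21, icc)
    print(f"ICC={icc:.2f}  adjusted SE={adjusted:.4f}")
```

An ICC of zero leaves the standard error unchanged, and larger clusters or larger ICCs widen it.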

Analysis of subgroups or subsets
If we consider that we have unexplained statistical heterogeneity in any of our study groupings, we will investigate this further using subgroup and sensitivity analyses. We will analyse the effectiveness of the subset of interventions identified by stakeholders as relevant to the UK context. Where possible we will examine intervention effects by participant sub-groups (for example in terms of age, socioeconomic status, sex and ethnicity) and contexts (for example in terms of school-level deprivation) in order to examine potential impacts on health inequalities. This will draw on existing methods involving an 'equity lens' employing meta-analyses of subgroup effects from included studies and/or meta-regression drawing on studies with different participant or site characteristics to assess whether these moderate effects. RQ5: Meta-regression and qualitative comparative analysis: If at least ten studies are found, we will employ meta-regression using Stata to investigate what factors moderate intervention effects in order to examine what characteristics of intervention, deliverers, contexts and students moderate effectiveness (RQ5). It may not be feasible to apply this method if we judge there are too many confounders or insufficient data, or if meta-regression is unable to account for interdependencies in complex interventions. Hence, we will complement meta-regression with qualitative comparative analysis, adapted for use in research synthesis to assess necessary and sufficient conditions for intervention effectiveness. As with our current review of positive youth development, the use of initial hypotheses derived from work addressing RQ 2 and 3 will protect us from 'dredging' the data for spurious statistically significant results. The required steps of 'qualitatively anchoring' outcomes in qualitative comparative analysis will ensure that changes in outcomes are meaningful and not simply statistical artefacts with little relevance for decision-making. 
We should stress that meta-regression and qualitative comparative analysis will be exploratory, hypothesis-building analyses since these will draw on observational rather than experimental comparisons.

PROSPERO
This information has been provided by the named contact for this review. CRD has accepted this information in good faith and registered the review in PROSPERO. CRD bears no responsibility or liability for the content of this registration record, any associated files or external websites.

METHODS
Protocol and registration (item 5): Indicate if a review protocol exists, if and where it can be accessed (e.g., Web address), and, if available, provide registration information including registration number. Reported on: 6
Eligibility criteria (item 6): Specify study characteristics (e.g., PICOS, length of follow-up) and report characteristics (e.g., years considered, language, publication status) used as criteria for eligibility, giving rationale. Reported on: 6
Information sources (item 7): Describe all information sources (e.g., databases with dates of coverage, contact with study authors to identify additional studies) in the search and date last searched. Reported on: Online File 1
Search (item 8): Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated. Reported on: Online File 1
Study selection (item 9): State the process for selecting studies (i.e., screening, eligibility, included in systematic review, and, if applicable, included in the meta-analysis). Reported on: 7
Data collection process (item 10): Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators. Reported on: 7
Data items (item 11): List and define all variables for which data were sought (e.g., PICOS, funding sources) and any assumptions and simplifications made. Reported on: Online File 1
Risk of bias in individual studies (item 12): Describe methods used for assessing risk of bias of individual studies (including specification of whether this was done at the study or outcome level), and how this information is to be used in any data synthesis.
Risk of bias across studies (item 15): Specify any assessment of risk of bias that may affect the cumulative evidence (e.g., publication bias, selective reporting within studies). Reported on: 8
Additional analyses (item 16): Describe methods of additional analyses (e.g., sensitivity or subgroup analyses, meta-regression), if done, indicating which were pre-specified.

RESULTS
Study selection (item 17): Give numbers of studies screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage, ideally with a flow diagram. Reported on: 9, Figure 1
Study characteristics (item 18): For each study, present characteristics for which data were extracted (e.g., study size, PICOS, follow-up period) and provide the citations. Reported on: 8-10, Table 1-3
Risk of bias within studies (item 19): Present data on risk of bias of each study and, if available, any outcome level assessment (see item 12). Reported on: Table 2
Results of individual studies (item 20): For all outcomes considered (benefits or harms), present, for each study: (a) simple summary data for each intervention group (b) effect estimates and confidence intervals, ideally with a forest plot. Reported on: 11-16
Synthesis of results (item 21): Present results of each meta-analysis done, including confidence intervals and measures of consistency. Reported on: N/A
Risk of bias across studies (item 22): Present results of any assessment of risk of bias across studies (see Item 15). Reported on: N/A
Additional analysis (item 23): Give results of additional analyses, if done (e.g., sensitivity or subgroup analyses, meta-regression [see Item 16]). Reported on: N/A

DISCUSSION
Summary of evidence (item 24): Summarize the main findings including the strength of evidence for each main outcome; consider their relevance to key groups (e.g., healthcare providers, users, and policy makers).