

What factors contribute to positive early childhood health and development in Australian Aboriginal children? Protocol for a population-based cohort study using linked administrative data (The Seeding Success Study)
  1. Kathleen Falster1,2,3,
  2. Louisa Jorm3,4,
  3. Sandra Eades5,
  4. John Lynch6,
  5. Emily Banks1,2,
  6. Marni Brownell7,
  7. Rhonda Craven8,
  8. Kristjana Einarsdóttir9,
  9. Deborah Randall3,4
  10. on behalf of the Seeding Success Investigators
  1. National Centre for Epidemiology and Population Health, Australian National University, Canberra, Australia
  2. The Sax Institute, Sydney, Australia
  3. Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia
  4. Centre for Health Research, School of Medicine, University of Western Sydney, Campbelltown, Australia
  5. Baker IDI Heart and Diabetes Institute, Melbourne, Australia
  6. School of Population Health, University of Adelaide, Adelaide, Australia
  7. Manitoba Centre for Health Policy, University of Manitoba, Winnipeg, Canada
  8. Institute of Positive Psychology and Education, Australian Catholic University, Sydney, Australia
  9. Telethon Kids Institute, University of Western Australia, Perth, Australia
Correspondence to Dr Kathleen Falster; kathleen.falster{at}


Introduction Australian Aboriginal children are more likely than non-Aboriginal children to have developmental vulnerability at school entry that tracks through to poorer literacy and numeracy outcomes and multiple social and health disadvantages in later life. Empirical evidence identifying the key drivers of positive early childhood development in Aboriginal children, and supportive features of local communities and early childhood service provision, is lacking.

Methods and analysis The study population will be identified via linkage of Australian Early Development Census data to perinatal and birth registration data sets. It will include an almost complete population of children who started their first year of full-time school in New South Wales (NSW), Australia, in 2009 and 2012. Early childhood health and development trajectories for these children will be constructed via linkage to a range of administrative data sets relating to birth outcomes, congenital conditions, hospital admissions, emergency department presentations, receipt of ambulatory mental healthcare services, use of general practitioner services, contact with child protection and out-of-home care services, receipt of income assistance and fact of death. Using multilevel modelling techniques, we will quantify the contributions of individual-level and area-level factors to variation in early childhood development outcomes in Aboriginal and non-Aboriginal children. Additionally, we will evaluate the impact of two government programmes that aim to address early childhood disadvantage, the NSW Aboriginal Maternal and Infant Health Service and the Brighter Futures Program. These evaluations will use propensity score matching methods and multilevel modelling.

Ethics and dissemination Ethical approval has been obtained for this study. Dissemination mechanisms include engagement of stakeholders (including representatives from Aboriginal community controlled organisations, policy agencies, service providers) through a reference group, and writing of summary reports for policy and community audiences in parallel with scientific papers.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Strengths and limitations of this study

  • This large, retrospective cohort study—constructed from linked, administrative data—will include an almost complete population of children born in the state of New South Wales, Australia; this will enable investigation of small population groups, such as Aboriginal children, and minimise selection bias.

  • This study will apply quasi-experimental methods to the analysis of routinely collected data to assess the effectiveness of extant programmes to address early childhood disadvantage.

  • The study uses only routinely collected data, and is subject to the limitations of these data. However, the linked data will be drawn from across multiple sectors and agencies and will include multiple measures for some key constructs to better capture diverse social and economic backgrounds.


Aboriginal Australians experience multiple social and health disadvantages from the prenatal period onwards.1 Infant2 and child3 mortality rates are higher among Aboriginal children, as are well-established influences on poor health, cognitive and education outcomes,4–6 including premature birth and low birth weight,7–9 being born to teenage mothers7 and socioeconomic disadvantage.1 ,8 Addressing Aboriginal early life disadvantage is of particular importance because of the high birth rate among Aboriginal people10 and subsequent young age structure of the Aboriginal population.11 Recent population estimates suggest that children under 10 years of age account for almost a quarter of the Aboriginal population compared with only 12% of the non-Aboriginal population of Australia.11

By school entry, 43–47% of Aboriginal children have markers of developmental vulnerability.12 ,13 In 2009, the first-ever national census of childhood development at school entry showed that Aboriginal children were 2–3 times more likely than non-Aboriginal children to be developmentally vulnerable, defined as an Australian Early Development Census (AEDC) score below the 10th centile, on one or more domains.14 The Longitudinal Study of Australian Children reported similar disparities for cognitive outcomes among Aboriginal children aged 4–5 years, although the number of Aboriginal children was very small and not representative of the Aboriginal population.15 There is currently a dearth of empirical research that identifies the drivers of positive early childhood health and development in Aboriginal children, or characterises vulnerable developmental trajectories.

Developmental vulnerability at school entry tracks through to poor literacy and numeracy outcomes across all schooling years.16–18 Results from the Programme for International Student Assessment, conducted every 3 years between 2000 and 2012, show consistently low achievement levels among Aboriginal secondary students in maths, literacy and science, including a recent decline in mathematical achievement.17 ,19 ,20 In 2012, more than 25% of year 9 Aboriginal students performed below the minimum standard for reading and numeracy compared with 5–6% of non-Aboriginal children.16 Aboriginal children are more likely to leave school early and less than half of the Aboriginal children who start high school complete year 12, compared with almost 80% of non-Aboriginal children.1 In older Aboriginal Australians, this trajectory of disadvantage manifests as limited education and life opportunities, unemployment,21 poor health and premature mortality.21

In 2008, the Council of Australian Governments committed to reducing Indigenous disadvantage and set six ‘Closing the Gap’ targets that aim to redress inequalities in life expectancy, child mortality, education and employment.22 Despite this political will, progress towards some of these—including numeracy and literacy outcomes—has been disappointing.23 Until the substantial inequalities in perinatal and early childhood disadvantage, health and development are reduced, and what ‘seeds success’ is identified, it is likely that the education and life prospects of Aboriginal children will remain poor.

What works in ‘seeding success’?

There is very limited empirical evidence about what works to promote positive early childhood health and development in Australian Aboriginal children, despite an ever-growing number of local, statewide and nationwide early childhood programmes and services, both universal and targeted. Furthermore, these services rarely undergo rigorous empirical evaluation due to limited resources. Qualitative research studies suggest numerous ways to improve access to early childhood services for Aboriginal children and their families, where access is defined as the opportunity for children and families to participate and fully experience the benefits of a programme, affordability, suitability and sufficient quality.24 Some examples include: provision of transport; locating services in areas where other daily activities occur (eg, schools); provision of low-cost or no-cost services; employing, training and retaining Aboriginal staff; provision of culturally competent and secure services; community involvement in the planning and delivery of services; and provision of flexible, comprehensive and continuous services.24 Choice is another facilitator of access, because some Aboriginal families prefer to use mainstream rather than Aboriginal-specific services.24 Furthermore, it remains unknown whether mainstream early childhood services with proven effectiveness in non-Aboriginal populations confer the same benefits to Aboriginal children.

How can routinely collected data help to address the evidence gaps?

There has been a recent surge of interest in Australia and internationally in using population-wide linked administrative data sets to better understand the factors that promote positive early childhood health and development25 ,26 and to evaluate the impact of early childhood programmes, services and policy changes in the ‘real world’.27–29

The Developmental Pathways in Western Australian Children project is an existing data linkage project investigating the pathways to health and well-being, education and juvenile delinquency outcomes in Western Australian children and young people.30–36 In South Australia and the Northern Territory, the Early Childhood Development Demonstration Project is a recent linkage project aiming to investigate adverse early life risks for healthy development in children at a population level, using administrative data.37 Both are using the AEDC, an adaptation of the Canadian Early Development Instrument for use in Australia.38 However, these studies are not specifically focussed on Aboriginal children, nor on how the early childhood development of Aboriginal children varies across urban and rural geographies or how features of local communities and service provision influence the development of Aboriginal children.

Randomised controlled trials (RCTs) are the gold standard for determining the efficacy of interventions; however, such evidence for early childhood interventions is often not available. In such cases, the application of innovative statistical methods to the analysis of routinely collected data may present the best opportunity to estimate effectiveness.39 Canadian, UK and Australian researchers have undertaken innovative examples of quasi-experimental evaluations of child development interventions.40–43 The national evaluation of UK Sure Start Local Programmes (an area-based early intervention for children living in deprived communities) is an exemplar of how quasi-experimental observational study methods can be used in the absence of RCT data. Comparison of early childhood development outcomes for children living in Sure Start areas with those in propensity-matched non-Sure Start areas showed the beneficial effects of the programmes.43 In Australia, a similar study design was implemented to evaluate the impact of Communities for Children, an Australian area-based initiative designed to enhance the development of young children living in disadvantaged communities.44 In Canada, analysis of linked population health and welfare data sets, including information on individual programme participation, has enabled evaluation of health service initiatives such as the Manitoba Healthy Baby Program41 and use of a screening tool for out-of-home care risk among newborns.42 Following this, the Pathways to Health and Social (PATHS) Equity for Children research programme proposes a novel population-based focus to understanding what ‘works’ to reduce inequities across a wide range of outcomes for children, including health, development and education.29

Our project, the Seeding Success Study, will capitalise on recent improvements in the availability of linked administrative data in Australia, including Medicare Australia data relating to general practitioner (GP) services, and data about participation in early childhood services. In partnership with researchers from related projects in Canada, the UK, South Australia, the Northern Territory and Western Australia, Aboriginal organisations and policymakers, we will analyse whole-of-population data for New South Wales (NSW) to investigate the determinants of positive early childhood development in Aboriginal children, and assess the impacts of two ‘real-world’ programmes that were implemented under circumstances where evidence of their efficacy could not be derived from RCTs: the NSW Aboriginal Maternal and Infant Health Service (AMIHS)45 and the NSW Department of Family and Community Services (FACS) Brighter Futures Program.46 Early evaluations of these programmes suggested some positive changes in proximal outcomes related to their objectives.45 ,47 ,48 However, each of these evaluations was limited by one or more of the following: use of single data sets, less than 2 years of outcome data and/or issues of confounding and selection bias. Finally, none of these evaluations was able to assess longer term outcomes such as early childhood health and development.

The Seeding Success Study will address key evidence gaps using linked, population person-level health, community services, welfare and development data for a large cohort of children who started school in NSW in 2009 and 2012. It will address the following research questions:

  1. What are the social, perinatal and early childhood health factors that promote positive early childhood development in Aboriginal and non-Aboriginal children?

  2. Is there geographic variation in positive early childhood development in Aboriginal and non-Aboriginal children? If so, what area-level factors contribute to this variation?

  3. Do Aboriginal children living in areas with an Aboriginal Maternal and Infant Health Service (AMIHS) have better health and development outcomes in the first 5 years of life than Aboriginal children not living in areas with an AMIHS?

  4. Do eligible children who have participated in the Brighter Futures Program have better health and development outcomes in the first 5 years of life than eligible children who have not participated in the Brighter Futures Program?

Methods and analysis

Data sources

Linked records will be requested from the following data sources for members of the study population (figure 1).

Figure 1

Overview of data sources.

The AEDC is a population measure of children's development in their first year of school. It is implemented nationally in Australia every 3 years, starting in 2009.14 Prior to 2014, this instrument was known as the Australian Early Development Index (AEDI). Since this study will only use data collected in 2009 and 2012, the data source will be referred to as the AEDI in the remainder of this paper. In NSW, AEDI data were collected for 87 170 and 94 323 children in 2009 and 2012, respectively.12 ,13 This is estimated to be approximately 97% of the NSW school starter population in 2009 and 2012 covering both public and private schools. A teacher who had known the child for at least 1 month completed the AEDI checklist. Teachers provided information about the child's development in five domains, including: (1) physical health and well-being; (2) social competence; (3) emotional maturity; (4) language and cognitive skills; and (5) communication skills and general knowledge.

The Perinatal Data Collection (PDC) includes records for all children born at ≥20 weeks gestation or weighing ≥400 g in NSW public or private hospitals, as well as planned home births. It includes demographic variables, and information on maternal health, the pregnancy, labour, birth and perinatal outcomes. The Registry of Births, Deaths and Marriages (RBDM) compiles birth and death registrations for NSW. Birth registrations include date of birth and Aboriginal status as reported by the mother. Death registrations include date of birth, date of death, age at death and year of death registration. The National Death Index (NDI) compiles death registrations from all states and territories of Australia.

The NSW Register of Congenital Conditions (RoCC) includes records for children born with a congenital condition as well as details of the congenital condition identified during pregnancy, at birth or during the first year of life and the date of diagnosis. Identifying information is removed from this data collection after 5 years; therefore, only 5 years of data are available for linkage at any time. At present, data for pregnancy outcomes (ie, date of termination of pregnancy, birth or still birth) recorded between 2006 and 2010 are available. For this reason, only linked records for children in the study population who started school, or potentially started school, in 2012 will be requested.

The NSW Admitted Patient Data Collection (APDC) includes records of all public and private hospital separations (discharges, transfers and deaths) in NSW. These data are available from 1 July 2000. It includes patient demographics, diagnoses and procedures coded according to the Australian Modification of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10-AM). All available records for children in the study population will be obtained. In addition, all records for the parents of the children in the study population will be obtained for the 5 years prior to the birth of the child (where available).

The NSW Emergency Department Data Collection (EDDC) includes records of all presentations to metropolitan emergency departments (EDs), and the majority of regional EDs, in NSW in 2005–2012. It includes patient demographics, mode of arrival, triage category, mode of separation, diagnoses and procedures coded according to ICD-10-AM49 or Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT). All available records for children in the study population who presented to EDs on or after 1 January 2005 will be obtained.

The NSW Mental Health Ambulatory Data Collection (MH-AMB) includes data on the assessment, treatment, rehabilitation or care of non-admitted patients. It includes records of specialist mental health services operated by NSW Health. The types of mental health services captured include: mental health day programmes, psychiatric outpatients and outreach services (eg, home visits); hospital-based consultation-liaison services to admitted patients in non-psychiatric and hospital emergency settings; same-day admitted non-procedural care; care provided by community workers to admitted patients and clients in staffed community residential settings; and mental health promotion and prevention services. Data are available from January 2000; however, there was significant undercounting of ambulatory mental healthcare contacts until 2005/2006. The MH-AMB data collection includes patient demographics, diagnosis codes coded according to the ICD-10-AM49 and other characteristics of the service provided (eg, location, service type) for each ‘contact’ between a clinician and a patient. All available records for children in the study population will be obtained. In addition, all records for the parents of the children in the study population will be obtained for the 5 years prior to the birth of the child (where available).

The Key Information Directory System (KiDS) is the NSW FACS’ electronic system for keeping records of selected clients, which was introduced during 2003. It includes records of all child protection contacts with FACS, including information about whether a child has: (1) been assessed by a child protection caseworker as being at actual harm/risk of harm; (2) had a legal decision made in relation to them (eg, court orders); (3) been placed in out-of-home care (including type of care and number of placements); (4) been referred to and participated in a FACS early intervention programme (eg, Brighter Futures). All records for children in the study population will be obtained from their date of birth until the end of their first year of school.

In addition to the data sources described thus far, it is our intention to request linked data from the Medicare Benefits Schedule (MBS) and Centrelink income assistance data, although some of the details of this request remain under negotiation with the relevant government agencies. The MBS data include records for claims for medical and diagnostic services. These records include the MBS item number, date of service, provider charge, schedule fee, benefit paid, payment method and scrambled provider number. All MBS records will be requested for children in the study population from birth until the end of their first year of school.

The Australian Government Department of Human Services, on behalf of the Department of Social Services, records data on the receipt of Australian Government payments administered via Centrelink. All records for a defined set of income assistance payments (most likely Family Tax Benefit Part A) that were provided to the carers of the children in the study population from the child's birth date through to the end of their first year of school will be requested.

Finally, data collected by the AMIHS during the study period, aggregated at the site level, will be obtained. This includes the following information for each AMIHS site: numbers of pregnant women seen; numbers of pregnant women who received antenatal care before 20 weeks gestation; numbers of low birthweight children; numbers of Aboriginal pregnant women who smoked or used alcohol or other drugs during pregnancy; and numbers of Aboriginal pregnant women who initiated breast feeding and were breast feeding 6 weeks after birth.

Study design and population

This is a retrospective cohort study based on an almost complete population of children who started their first year of full-time school in the state of NSW, Australia, in 2009 and 2012. The study population will include all children who started school, and have an AEDI record, in NSW in 2009 or 2012 (henceforth referred to as the ‘NSW school starter population’), which is estimated to be almost 181 500 children.12 ,13 Of these children, almost 9000 are Aboriginal (4.5% and 5.3% of the NSW school starter population in 2009 and 2012, respectively).12 ,13 Within the NSW school starter population, children who were born in NSW will be ascertained via linkage to the PDC and the RBDM birth registrations (figure 2). Since these children will have data available from birth to school age, they will be the focus of the majority of analyses for this study (henceforth referred to as the ‘study cohort’).

Figure 2

The inflow and outflow of children in NSW from birth to school age, including the potential school starter and the school starter populations for 2009 and 2012, and the main study cohort. AEDI, Australian Early Development Index; NSW, New South Wales; 1, defined by the date of birth range from the 2009 and 2012 NSW AEDI data; 2, unable to ascertain numbers from linked data sources included in this study.

To define and contextualise the study cohort in relation to the whole population, we will first identify children who were born in NSW (ascertained from the PDC and RBDM) who could potentially have started school in 2009 or 2012 (henceforth referred to as the ‘potential school starter population’). The criterion for potential school starters will be a date of birth that lies within the date of birth ranges in the AEDI data for children who started school in NSW in 2009 and 2012. If a potential school starter does not have an AEDI record completed in an NSW school in 2009 or 2012, it is possible that they: (1) started school in the year before or after the AEDI data collection year, particularly if their birth date lies at the upper or lower ends of the range; (2) started school in another Australian state where their teacher may or may not have completed the AEDI on their behalf; (3) migrated overseas prior to starting school; or (4) died during early childhood. From the linked data in our study, we will ascertain the majority of children who started school in another state via linkage to the national AEDI data collection. We will ascertain the number of children from the potential school starter population who died during early childhood from the RBDM and the NDI. However, we will not be able to ascertain the number of potential school starters who started school in years other than 2009 or 2012, started school interstate but have no AEDI record, or migrated or died overseas during early childhood.
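The classification logic described above can be sketched in a few lines. This is an illustration only: the function names, the group labels and the flag arguments are hypothetical, and the date-of-birth window is derived from whatever AEDI dates of birth are supplied rather than from any published range.

```python
from datetime import date

def dob_range(aedi_dobs):
    """Return the (earliest, latest) date of birth seen in one AEDI collection year."""
    return min(aedi_dobs), max(aedi_dobs)

def classify(birth_dob, has_nsw_aedi, has_interstate_aedi, died, window):
    """Assign an NSW-born child to a group for one collection year.

    All labels are hypothetical; the flags stand in for the results of linkage
    to the NSW AEDI, the national AEDI and the RBDM/NDI death records.
    """
    start, end = window
    if not (start <= birth_dob <= end):
        return "outside potential school starter population"
    if has_nsw_aedi:
        return "study cohort"
    if has_interstate_aedi:
        return "started school interstate"
    if died:
        return "died in early childhood"
    # Started school in another year, interstate without an AEDI record,
    # or migrated or died overseas: not distinguishable in the linked data.
    return "unascertained"
```

A child born inside the window with no NSW AEDI record, no interstate AEDI record and no death record falls into the residual group, which mirrors the children the protocol says cannot be ascertained.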

Children in the NSW school starter population who do not link to a birth record in the NSW PDC or RBDM will include those: (1) born interstate or in another country (which can be ascertained from the ‘Place of Birth’ variable in the AEDI data collection); or (2) born in NSW but either affected by linkage errors or lacking a birth record in the PDC or RBDM. Of note, residential address was not recorded in the 2009 AEDI data, which is problematic for linkage purposes when there is more than one child with the same name attending the same school. However, residential address was recorded in the 2012 AEDI data, which is likely to improve AEDI and PDC/RBDM linkages.

In summary, the total study population includes the previously defined potential school starter population (for 2009 and 2012) and the NSW school starter population in 2009 and 2012 (figure 2), with the study cohort referring to children who have data available from birth to school age. Among the study population, siblings—defined as children born to the same mother—will be identified via linkage to the PDC (mother and baby records) and the RBDM birth registrations. We will not request identification of siblings who are not already part of the study population. Parents of children in the study population will also be identified via linkage to the PDC (mother and baby records) and the RBDM birth registrations.

Data linkage

The Centre for Health Record Linkage (CHeReL) will conduct the first linkage for this study, which will involve identification of the study population through linkage of the AEDI to the PDC and the RBDM birth registrations, followed by identification of records that link to the study population (or their parents) from the AEDI, PDC, RBDM, RoCC, APDC, EDDC, MH-AMB and KiDS data collections. The Australian Institute of Health and Welfare (AIHW) Data Integration Services Centre will conduct the second linkage for this study. This will involve linking the study population (identified during the first linkage) to the MBS, Centrelink and the NDI data collections.

For both linkages, probabilistic matching methods will be used to link records using identifiers (including name, date of birth, sex and address) and the separation principle will be applied. Under the CHeReL's approach, data custodians supply identifying data to the linkage unit and de-identified content data directly to researchers. This has previously been described as the ‘best practice protocol’ for preserving individual privacy.50 Quality assurance data show false-positive and false-negative rates of 0.04% and <0.05%, respectively, in NSW.51 In contrast, the AIHW conducts ‘linkage’ and ‘merging’ as two separate and distinct operations within the one agency, applying the separation principle through an alternative method endorsed by the Commonwealth.52
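For readers unfamiliar with probabilistic matching, the sketch below shows one common formulation, the Fellegi-Sunter agreement weight. The protocol does not state which algorithm or software the linkage units use, so this is an assumption, and the m and u probabilities are invented for illustration.

```python
import math

# Illustrative probabilities only (assumptions, not CHeReL or AIHW parameters):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELDS = {
    "name": (0.95, 0.01),
    "dob": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "address": (0.80, 0.02),
}

def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum log2 agreement/disagreement weights over the identifying fields."""
    total = 0.0
    for field, (m, u) in fields.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing identifier contributes no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total
```

Record pairs with a total weight above an upper threshold would be accepted as links, those below a lower threshold rejected, and those in between sent for clerical review. A missing address, as in the 2009 AEDI data, simply removes that field's contribution and so lowers the achievable weight.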

Analysis plan: research questions 1–2


The outcome variable for these analyses is early childhood development in the child's first year of full-time school, as measured by the AEDI. AEDI scores range from 0 (low ability) to 10 (high ability) for each of five early childhood development domains: (1) physical health and well-being; (2) social competence; (3) emotional maturity; (4) language and cognitive skills; and (5) communication skills and general knowledge. These scores will be analysed as continuous outcomes, with appropriate checking of model assumptions because the data are negatively skewed. AEDI scores will also be transformed into dichotomised outcomes based on whether the child's score on each domain qualifies as developmentally ‘on track’ (ie, above the 25th centile) or not, or ‘vulnerable’ (ie, below the 10th centile) or not. Aggregate dichotomised outcome variables will also be constructed to compare children who were developmentally on track on all five domains, or developmentally vulnerable on any of the five domains, with all other children. Where there are differences in domain outcomes between comparison groups (eg, Aboriginal and non-Aboriginal children, or children exposed to a programme or not), we will explore whether differences in specific AEDI subdomain variables underlie differences at the domain level.
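The dichotomised and aggregate outcomes described above can be sketched as follows. The domain keys and the cut-point values passed in are hypothetical; the actual score cut points for the 10th and 25th centiles would come from the AEDI reference data.

```python
DOMAINS = ("physical", "social", "emotional", "language", "communication")

def dichotomise(scores, cutoffs):
    """Derive per-domain and aggregate binary outcomes for one child.

    scores:  domain -> AEDI score on the 0-10 scale
    cutoffs: domain -> (p10, p25) score cut points (placeholders in this sketch)
    """
    on_track = {d: scores[d] > cutoffs[d][1] for d in DOMAINS}    # above 25th centile
    vulnerable = {d: scores[d] < cutoffs[d][0] for d in DOMAINS}  # below 10th centile
    return {
        "on_track": on_track,
        "vulnerable": vulnerable,
        "on_track_all_five": all(on_track.values()),
        "vulnerable_any": any(vulnerable.values()),
    }
```

A child scoring between the two cut points on a domain is neither on track nor vulnerable on that domain, so the two aggregate outcomes are not complements of each other.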

Explanatory variables

Individual-level explanatory variables will include:

Demographic and socioeconomic factors: age, sex, Aboriginality, language/s spoken at home, country of birth, mobility indicator (derived from change of address recorded in MBS data), carer receipt of income assistance payments and private health insurance status.

Parental characteristics: maternal age at child's birth, maternal age at first birth, marital status, country of birth and Aboriginality; paternal Aboriginality; and parental health (including mental health) and history of drug and alcohol problems for the 5 years prior to the child's birth (where available).

Health status: congenital conditions within the first year of life and morbidity indicators. Morbidity indicators will be constructed from APDC and EDDC diagnoses for important childhood conditions that may impact on development, such as age of first ventilation tube insertion (to drain middle ear fluid for children with otitis media), number of ventilation tube insertion procedures and number of condition-specific hospital admissions (eg, asthma, injury, gastroenteritis).

Early childhood learning indicators: regular non-parental care and/or participation in other educational programmes or playgroup before entering school, active engagement of caregivers in the school and reading encouraged at home.

Health service use indicators: number of GP visits; number of ED presentations; number of hospital admissions; usual provider continuity index (proportion of visits to most frequently seen GP provider); continuity of care index (derived from number of different GP providers seen and number of visits to each).53 Annual averages will be categorised for analysis (eg, high, medium and low frequencies).

Child protection information: whether or not a child has been assessed by FACS as being at actual harm/at risk of harm, whether or not a child has lived in out-of-home care and whether a child and their family has participated in the Brighter Futures Program.

Area-level explanatory variables will include: accessibility and remoteness, as measured by the Accessibility/Remoteness Index of Australia Plus (ARIA+);54 socioeconomic disadvantage, as measured by the Australian Bureau of Statistics (ABS) Socioeconomic Indexes for Areas (SEIFA);55 presence of Aboriginal Medical Services; presence of an AMIHS; proportion of Aboriginal pregnancies/births in an area managed by an AMIHS; numbers of Aboriginal and non-Aboriginal children attending preschool; numbers of full-time equivalent health workers (including general medical practitioners, nurses, midwives and Aboriginal health workers) per 10 000 population; measures of social capital from the NSW Population Health Survey;56 features of local communities (derived from ABS Census data), such as information on median personal and household income, mortgage repayment and rent; average number of persons per bedroom and household size; employment; non-school qualifications and housing type for Aboriginal residents in each area.57
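The two GP continuity measures listed under the health service use indicators can be illustrated with a short calculation. The usual provider continuity index follows the definition given in the text. For the continuity of care index, the sketch assumes the widely used Bice-Boxerman form; the cited reference, not this sketch, defines the version the study will use. Function names are hypothetical.

```python
from collections import Counter

def upc_index(provider_ids):
    """Usual provider continuity: share of visits to the most frequently seen GP."""
    if not provider_ids:
        return None
    counts = Counter(provider_ids)
    return max(counts.values()) / len(provider_ids)

def coc_index(provider_ids):
    """Continuity of care, assuming the Bice-Boxerman form:
    (sum of squared visits per provider - N) / (N * (N - 1)).
    Undefined for fewer than two visits.
    """
    n = len(provider_ids)
    if n < 2:
        return None
    counts = Counter(provider_ids)
    return (sum(c * c for c in counts.values()) - n) / (n * (n - 1))
```

For a child with four visits, three of them to the same scrambled provider number, the first index is 0.75 and the second is 0.5. Both reach 1 when every visit is to one provider.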

Statistical analysis

Separate multilevel models will be developed for each of our outcomes, using an iterative process. Models will mostly have two levels: the individual child and the area where they live. We will build the models in two stages. First, the variation in the outcomes will be partitioned to determine how much can be attributed to the individual child or the area where they live;58,59 linked information about siblings will be used to explore an additional third family level in the models. Second, we will enter explanatory variables into the models to determine which factors have the most influence on outcomes and inequalities in these outcomes between Aboriginal and non-Aboriginal children.
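The first stage, partitioning variation between the child and area levels, is often summarised with a variance partition coefficient (VPC). The sketch below shows one way this could be calculated from fitted variance components. For the logistic case it assumes the latent-variable approach, in which the child-level variance is fixed at π²/3; the protocol does not specify which approach will be used, and the function names are illustrative.

```python
import math


def vpc_linear(area_variance, child_variance):
    """Share of total variance attributable to areas in a two-level linear model."""
    return area_variance / (area_variance + child_variance)


def vpc_logistic(area_variance):
    """VPC for a two-level logistic model under the latent-variable approach.

    Assumption: the level-1 (child) variance on the logit scale is taken as
    pi^2 / 3, the variance of the standard logistic distribution.
    """
    return area_variance / (area_variance + math.pi ** 2 / 3)


# Example with made-up variance components
share_linear = vpc_linear(area_variance=1.0, child_variance=3.0)
share_logistic = vpc_logistic(area_variance=0.2)
```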

Statistical analysis will be performed using SAS V.9.3 and MLwiN V.2.6. Continuous and dichotomous outcomes for the five AEDI domains and the aggregate AEDI measure will be modelled separately using multilevel linear and logistic regression, respectively. For each outcome, we will build models separately for Aboriginal children and for all children, in which the Aboriginal to non-Aboriginal outcome ratio/difference will represent inequality in developmental outcomes. In each multilevel model, we will allow the outcome to vary by geographic area (random intercept) and, in the all children model, the Aboriginal to non-Aboriginal ratio to vary (random slope), enabling us to identify areas that are performing better or worse in terms of early childhood developmental outcomes, and areas with greater or lesser inequality in these outcomes. For siblings, we will estimate both child-level and family-level variation. We will explore whether the between-family variation needs to be estimated separately for families with and without siblings, or whether it can be generalised across all families.59

Analysis plan: research questions 3–4

The following analyses are designed to deal with two common scenarios in the assessment of ‘real-world’ programmes. The first is when a programme is rolled out in a non-random fashion on an area basis and only area-level data on service provision are available (AMIHS). The second is when a programme is rolled out in a non-random fashion and data are collected on individuals who used the programme, but not those who did not participate (Brighter Futures). We identified these programmes as potential targets for investigation through consultation with various agencies, including NSW Kids and Families and FACS.

Analyses to assess the impact of the AMIHS

An AMIHS consists of a community midwife and Aboriginal health worker team who provide community-based services to pregnant Aboriginal women in conjunction with existing medical, midwifery, paediatric and child and family health staff. AMIHSs were established in specific areas in a non-randomised manner. In 2001, AMIHSs were established in seven locations and provided services in 24 Local Government Areas (LGAs). Factors that influenced the choice of location were the annual numbers of Aboriginal births and availability of existing services. In 2007, the AMIHS was expanded to an extra 17 locations that offered services in an additional 46 LGAs. Of interest is whether: (1) pregnancy and birth outcomes for Aboriginal mothers and their babies, and (2) early childhood developmental outcomes for Aboriginal children, were better in areas where an AMIHS was established compared with similar areas.

For the purposes of our analyses, mothers and their children will be allocated to the intervention group on an intention-to-treat basis whereby those living in an area with an AMIHS at the time of the birth will be considered ‘exposed’. We intend to identify ‘unexposed’ comparison areas using propensity score matching methods.60 These methods aim to reduce the impact of confounding on the study outcome when the exposure has not been randomly allocated. We will use logistic regression to model the area characteristics that are associated with having an AMIHS. Propensity-matched ‘unexposed’ areas will be those that have a similar propensity to receive an AMIHS, based on their sociodemographic characteristics, but did not receive the service.43 We will use area-level characteristics as described for research questions 1 and 2 (eg, socioeconomic indicators, accessibility, proportion of Aboriginal residents), plus others that are specifically relevant to perinatal services (eg, level of local maternity services, birth rates, birth outcomes prior to the AMIHS). The statistical power will be constrained by the number of areas with an AMIHS and the number of births to Aboriginal mothers in these areas. However, this real-world evaluation will use whole-of-population data for the two birth/school starter cohorts in our study population.
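The matching step can be sketched as follows, assuming the area-level propensity scores have already been estimated by logistic regression. This is a minimal greedy 1:1 nearest-neighbour match with a caliper; the caliper width, the matching order and the area identifiers are illustrative assumptions, not choices stated in the protocol.

```python
def match_areas(exposed, unexposed, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    exposed, unexposed: dicts mapping area id -> estimated propensity score.
    caliper: maximum allowed score difference for a match (assumed value).
    Returns a list of (exposed_area, matched_unexposed_area) pairs.
    """
    available = dict(unexposed)  # unexposed areas not yet used as a match
    pairs = []
    # Process exposed areas from highest to lowest score
    for area_id, score in sorted(exposed.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        best = min(available, key=lambda k: abs(available[k] - score))
        if abs(available[best] - score) <= caliper:
            pairs.append((area_id, best))
            del available[best]  # match without replacement
    return pairs


# Example with hypothetical areas and scores
pairs = match_areas({"E1": 0.8, "E2": 0.4}, {"U1": 0.75, "U2": 0.45, "U3": 0.1})
```

Exposed areas with no unexposed area inside the caliper are left unmatched, which trades sample size for closer comparability.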

The primary outcome of interest is early development in the first year of full-time school, as measured by the AEDI: for example, whether a child was developmentally ‘on track’ or not, or ‘vulnerable’ or not, on each AEDI domain and on any domain, as described in the methods for objectives 1 and 2. Secondary outcomes of interest include pregnancy and birth outcomes for Aboriginal mothers and babies in the study cohort, including: numbers of pregnant Aboriginal women who had their first antenatal visit before 20 weeks gestation; numbers of pregnant Aboriginal women who were smoking during the second half of their pregnancy; and numbers of Aboriginal infants who were born preterm (less than 37 weeks gestation), with a low birth weight (less than 2500 g), small for gestational age or large for gestational age.
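The threshold-based secondary outcomes could be derived from a perinatal record along the following lines. The field names are hypothetical, and the small and large for gestational age indicators are omitted because they depend on reference centiles that are not given here.

```python
def birth_outcome_flags(gestational_age_weeks, birth_weight_g, first_antenatal_visit_weeks):
    """Derive binary indicators from one birth record (hypothetical field names).

    first_antenatal_visit_weeks may be None when no visit was recorded.
    """
    return {
        # Preterm: born at less than 37 weeks gestation
        "preterm": gestational_age_weeks < 37,
        # Low birth weight: less than 2500 g
        "low_birth_weight": birth_weight_g < 2500,
        # First antenatal visit before 20 weeks gestation
        "early_antenatal_visit": (
            first_antenatal_visit_weeks is not None
            and first_antenatal_visit_weeks < 20
        ),
    }


# Example record: born at 36 weeks, 2400 g, first antenatal visit at 12 weeks
flags = birth_outcome_flags(36, 2400, 12)
```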

We will use two-level multilevel linear and logistic regression models (mothers and babies nested within areas) to compare outcomes between individuals living in an AMIHS area and individuals living in a propensity-matched comparison area, using an intention-to-treat approach.

Analyses to assess the impact of the Brighter Futures Program

Brighter Futures is a voluntary, targeted early intervention programme for families with children, or who are expecting a child, that aims to prevent vulnerable children and families from entering the child protection system through provision of intervention and support that will achieve long-term benefits for the children.46 The programme provides a range of tailored services including case management, casework focused on parent vulnerabilities, structured home visiting, quality children's services, parenting programmes and brokerage funds. Families may enter the programme following a report via the Child Protection Helpline, or referral from an AMIHS or partner agency.

Families eligible for the Brighter Futures Program require, on average, 12 months of sustained case management for the delivery of coordinated services and support to meet individual and family needs. To be eligible for the programme, families must have at least one child under the age of 9, or be expecting a child, and have at least one of the following parental vulnerabilities that adversely affect their capacity to parent and/or the child's safety and well-being: domestic violence; drug or alcohol misuse; parental mental health issues; lack of parenting skills or inadequate supervision; or parent(s) with significant learning difficulties or intellectual disability.

The programme has been progressively rolled out since 2003/2004. Data on individuals who participated in the programme from July 2007 to December 2009 were included in the Brighter Futures evaluation undertaken by the Social Policy Research Centre at the University of New South Wales. The programme was initially delivered by FACS and partner agencies in NSW. In January 2012, key programme changes included the delivery of the programme by 16 non-government agencies, streamlined referral pathways and refocusing the programme to target families with children (0–8 years of age) at high risk of entering the statutory child protection system. These decisions are consistent with outcomes of the Brighter Futures evaluation and subsequent data analysis by FACS, which indicate that Brighter Futures can improve the safety of children in high-risk families with complex needs.

Of interest is whether children who participated in the Brighter Futures Program have better early developmental outcomes at school entry than a comparison group of children who did not participate in the Brighter Futures Program.

On the basis of the 2009–2011 Brighter Futures Program participation data,61 we estimate that 4350–8700 children in the study population have participated in the programme during the study period, of whom 1150–2400 (∼27%) were Aboriginal (assuming equal numbers of children in 1 year age groups and a maximum of 2 years programme participation). We will use two comparison groups for this analysis: (1) children recorded in the KiDS database as eligible for the Brighter Futures Program who did not participate; and (2) a propensity-matched comparison group from the whole study population, matched on individual-level characteristics similar to those outlined in the methods to address objectives 1 and 2.

The outcomes for this analysis will be continuous and dichotomised early development measures (eg, developmentally ‘on track’ or not, or ‘vulnerable’ or not) for each AEDI domain and for any domain. We will use two-level multilevel logistic regression models (children within areas) to compare early development outcomes between children who entered the programme and the comparison groups, controlling for area characteristics.

We will also explore the possibility of comparing children who have been exposed to both the Brighter Futures and AMIHS programmes with children who have only been exposed to one programme or none. Programmes are typically evaluated in isolation, but this is not how they are experienced in real life by families. For example, one of the three referral pathways for the Brighter Futures Program is via the AMIHS, so a substantial group of children have been exposed to both programmes from the prenatal period. This continuation of support via involvement in sequential programmes throughout early childhood is likely to lead to better developmental outcomes, particularly for vulnerable and disadvantaged children.

Statistical power

The study includes whole-of-population data for the two birth/school starter cohorts, as defined previously. On the basis of published AEDI data,12,13 the study population will comprise approximately 181 500 children, of whom almost 9000 will be Aboriginal. Analysis of NSW hospital data suggests that there will be approximately 9500 families with siblings in the 2009 and 2012 AEDI cohorts (including 430 Aboriginal families), and published birth data indicate that there will be approximately 2500 families with twins (including 75 Aboriginal families). Among Aboriginal children, we estimate: 4300–4850 children will score as developmentally on track and 900–1350 children will score as developmentally vulnerable on each of the five AEDI domains. Around 2800 (39%) Aboriginal children will score as developmentally vulnerable on one or more domains. For multilevel models, the numbers of units at higher levels (ie, area and families) are also important. We will have a minimum of 12 000 families and 151 areas in our models.

Using publicly available community-level AEDI data,62,63 we ran a two-level multilevel logistic regression model for one aggregate developmental outcome measure (ie, risk of developmental vulnerability; figure 3A) and an example simulation (figure 3B) using a total sample of 181 500, with the proportion of Aboriginal children in each LGA derived from ABS estimates.64,65 Binomial outcome data were simulated assuming a baseline risk of being vulnerable of 21% and a community-level random effect based on the actual variation in the published data (figure 3A). The empirical power calculations from these simulations (1000 replications for each data point) indicate that we will have 80% power to detect an OR of 1.075 and 90% power to detect an OR of 1.09 (figure 3B), equivalent to a risk of being vulnerable of 22.6% and 22.9%, respectively, among Aboriginal children. Power will be similar for other outcomes, such as the odds of being developmentally on track.
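The general idea of an empirical power calculation by simulation can be sketched as below. This is not the authors' simulation: it uses far smaller samples, made-up values for the area-level standard deviation and the proportion exposed, and a Cochran-Mantel-Haenszel test stratified by area as a simple stand-in for fitting a multilevel logistic model. Only the 21% baseline risk is taken from the text.

```python
import math
import random


def simulate_power(odds_ratio, n_areas=30, n_per_area=200, prop_exposed=0.3,
                   baseline_risk=0.21, area_sd=0.3, n_reps=200, seed=1):
    """Estimate power as the share of simulated data sets with a significant result.

    Outcomes are simulated with an area-level random intercept on the logit
    scale. Each data set is tested with a Cochran-Mantel-Haenszel chi-square
    statistic stratified by area (a swapped-in stand-in for a multilevel model).
    All default values except baseline_risk are illustrative assumptions.
    """
    rng = random.Random(seed)
    base_logit = math.log(baseline_risk / (1 - baseline_risk))
    log_or = math.log(odds_ratio)
    critical = 3.841  # chi-square critical value, 1 df, alpha = 0.05
    rejections = 0
    for _ in range(n_reps):
        diff_sum = 0.0  # sum over areas of observed minus expected exposed cases
        var_sum = 0.0   # sum over areas of the hypergeometric variance
        for _ in range(n_areas):
            u = rng.gauss(0.0, area_sd)  # area random intercept
            risk_unexposed = 1 / (1 + math.exp(-(base_logit + u)))
            risk_exposed = 1 / (1 + math.exp(-(base_logit + u + log_or)))
            exposed_cases = exposed_total = cases = 0
            for _ in range(n_per_area):
                exposed = rng.random() < prop_exposed
                case = rng.random() < (risk_exposed if exposed else risk_unexposed)
                exposed_total += exposed
                cases += case
                exposed_cases += exposed and case
            n = n_per_area
            diff_sum += exposed_cases - exposed_total * cases / n
            var_sum += (exposed_total * (n - exposed_total) * cases * (n - cases)
                        / (n * n * (n - 1)))
        if var_sum > 0 and diff_sum ** 2 / var_sum > critical:
            rejections += 1
    return rejections / n_reps


# Example: estimated power for a hypothetical OR of 1.3 at these small sample sizes
power = simulate_power(1.3)
```

Repeating the call over a grid of odds ratios gives a power curve; with an odds ratio of 1.0 the function estimates the type I error rate instead.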

Figure 3 (A) Variation in the proportion of developmentally vulnerable children by Local Government Area in 2009 and 2012, and (B) empirical power from simulations for ORs between 1.00 and 1.15.

Ethics and dissemination

To enhance the translation of the project's findings into policy and practice, a reference group will be convened comprising policy stakeholders and organisations involved in providing healthcare, community services, welfare and education to Aboriginal and non-Aboriginal children in NSW. During the final stages of the project, we will also hold a policy forum to promote academic, professional and public discussion on policy and practice issues arising from the project. Outputs from the project will include scientific papers, summary reports in formats designed for policy audiences and presentations at conferences, collaborator meetings and reference group meetings.


Promoting positive early childhood development in Aboriginal children is a key priority for Australia, and yet we have little information about the factors that drive it, and little evidence regarding what programmes and services are effective. The Seeding Success Study will provide evidence that addresses these deficits. Capitalising on recent developments in the availability of linked data relating to child development, health and community services, our study will bring together comprehensive population data relating to social and demographic factors, perinatal factors, health in early childhood, use of early childhood services and child development at school entry for Aboriginal children in NSW. Our study's use of whole-of-population data is one of its greatest strengths. It maximises statistical power while minimising selection bias associated with traditional cohort studies. The large size of the study population (including ∼7800 Aboriginal children) has the additional benefit of increasing the ‘visibility’ of the early life experiences of Aboriginal children. In this way, our study is a useful adjunct to smaller scale epidemiological studies that recruit relatively small numbers of Aboriginal children and may lack the power to detect important relationships between exposures and outcomes in this group. Moreover, the use of whole-of-population data in a large and geographically diverse setting, where Aboriginal children are distributed across metropolitan, regional and remote locations, will enable us to explore geographic variation in positive early childhood health and development outcomes in Aboriginal children, as well as area-level factors that might explain this variation.

An additional strength of this study is its application of quasi-experimental observational methodologies to linked population data to assess the impact of two ‘real-world’ programmes/services, which has important implications for policy and practice. Through well-developed mechanisms for stakeholder engagement and transfer to policy and practice, we will work in partnership with communities, Aboriginal organisations and policymakers to inform the development, targeting and evaluation of programmes and services that will work to ‘seed success’ and maximise Aboriginal children's health, well-being and potential.

Although the use of routinely collected population data has a number of advantages, there are also limitations. An important consideration in this study is that Aboriginal people are known to be underenumerated in the NSW hospital data66,67 and death registration data.68 It is also known that the misclassification errors are not randomly distributed and that recording of Aboriginal status has improved over time in the hospital data.66,67,69 It is likely that Aboriginal people are under-recorded in the other linked data sources in this study too. Our study design aims to minimise the impact of this limitation on our findings. First, we will capitalise on the linkage of data from multiple sectors and agencies that will enable us to obtain multiple indicators of Aboriginal status. This has previously been shown to improve the enumeration of Aboriginal people.68,70 We will investigate the application of different Aboriginal identification algorithms using multiple data sources,68–71 balanced against the likelihood of introducing differential misclassification bias related to the number and recency of contact with health services. Second, we will use a cohort approach in our study, which means that under-recording of Aboriginal status will affect the numerator and the denominator, as opposed to the numerator only, which is the case when population rates are estimated using census data as the denominator. In this way, the impact of the under-recording of Aboriginal status on study findings will be minimised.
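To make the idea of alternative identification algorithms concrete, the sketch below applies three simple rules to the Aboriginal status flags recorded across one person's linked records. The rule names and definitions (ever identified, two or more records, majority of records) are illustrative assumptions and are not necessarily the algorithms in the cited references.

```python
def aboriginal_status(record_flags, rule="ever"):
    """Classify one person from the flags on their linked records.

    record_flags: list of True / False / None, where None means status was not
                  stated on that record (hypothetical input layout).
    rule: "ever" (any record identifies the person as Aboriginal),
          "two_or_more" (at least two records do), or
          "majority" (more than half of the records with a stated status do).
    Returns True, False, or None when no record has a stated status.
    """
    stated = [flag for flag in record_flags if flag is not None]
    if not stated:
        return None
    positives = sum(stated)
    if rule == "ever":
        return positives >= 1
    if rule == "two_or_more":
        return positives >= 2
    if rule == "majority":
        return positives > len(stated) / 2
    raise ValueError(f"unknown rule: {rule}")


# Example: identified as Aboriginal on one of three records with a stated status
flags = [True, False, False, None]
by_rule = {rule: aboriginal_status(flags, rule)
           for rule in ("ever", "two_or_more", "majority")}
```

More inclusive rules such as "ever" raise enumeration but classify people with more service contacts as Aboriginal more readily, which is the differential misclassification risk noted above.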


The authors would like to thank the Australian Government Department of Education, the NSW Ministry of Health, the NSW Register of Births, Deaths and Marriages and the NSW Department of Family and Community Services (FACS) for allowing access to the data. They also thank the Australian Institute of Health and Welfare (AIHW), Medicare Australia and the Australian Government Departments of Health, Human Services and Social Services for discussing access to data, as well as the NSW Centre for Health Record Linkage and the AIHW Data Integration Services Centre for conducting the linkage of records. The authors acknowledge contributions from Caitlin McDowell (FACS) relating to the Brighter Futures component of the research.




  • Collaborators The Seeding Success Investigator team comprises LJ, KF, SE, JL, EB, MB, RC, KE, DR, Sharon Goldfeld, Alastair Leyland, Elizabeth Best and Marilyn Chilvers.

  • Contributors KF and LJ had overall responsibility for the conception of this study with scientific input from the chief investigators. DR contributed expertise in the development of the statistical analysis plan for the study. KF led the writing of this paper. All authors contributed to the design of the study, the writing of this paper and approved the final draft.

  • Funding This work was supported by the National Health and Medical Research Council (NHMRC) (grant number 1061713). KF was supported by an NHMRC Early Career Fellowship (#1016475) and an NHMRC capacity building grant (#573122). EB was supported by an NHMRC Senior Research Fellowship (#1042717). SE was supported by an NHMRC Career Development Fellowship (#1013418). JL was supported by an Australia Fellowship from the NHMRC (#570120). MB was supported by the Manitoba Centre for Health Policy Population-Based Child Health Research Award. KE was funded by an NHMRC Early Career Fellowship (#634533).

  • Competing interests None declared.

  • Ethics approval NSW Population and Health Services Research Ethics Committee; The Aboriginal Health and Medical Research Council of NSW Ethics Committee; and the University of Western Sydney and Australian National University Human Research Ethics Committees.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement This study will use linked records from the data sets detailed in the study protocol, for children who started school in New South Wales (NSW), Australia, in 2009 and 2012, as well as those children born in NSW who might have started school in 2009 or 2012. These data are available to researchers on request and subject to approval from the relevant data custodians and ethics committees, and via linkage conducted by the NSW Centre for Health Record Linkage or the Australian Institute of Health and Welfare Data Integration Services Centre.
