Objective To develop a simplified Finnegan Neonatal Abstinence Scoring System (sFNAS) that will highly correlate with scores ≥8 and ≥12 in infants being assessed with the FNAS.
Design, setting and participants This is a retrospective analysis involving 367 patients admitted to two level IV neonatal intensive care units with a total of 40 294 observations. Inclusion criteria included neonates with gestational age ≥37 0/7 weeks, who are being assessed for neonatal abstinence syndrome (NAS) using the FNAS. Infants with a gestational age <37 weeks were excluded.
Methods A linear regression model based on the original FNAS data from one institution was developed to determine optimal values for each item in the sFNAS. A backward elimination approach was used, removing the items that contributed least to the Pearson’s correlation. The sFNAS was then cross-validated with data from a second institution.
Results Pearson’s correlation between the proposed sFNAS and the FNAS was 0.914. The optimal treatment cut-off values for the sFNAS were 6 and 10 to predict FNAS scores ≥8 and ≥12, respectively. The sensitivity and specificity of these cut-off values to detect FNAS scores ≥8 and ≥12 were 0.888 and 0.883 for a cut-off of 6, and 0.637 and 0.992 for a cut-off of 10, respectively. The sFNAS cross-validation resulted in a Pearson’s correlation of 0.908, sensitivity and specificity of 0.860 and 0.873 for a cut-off of 6, and 0.525 and 0.986 for a cut-off of 10, respectively.
Conclusion The sFNAS has a high statistical correlation with the FNAS, and it is cross-validated for the assessment of infants with NAS. It has excellent specificity and negative predictive value for identifying infants with FNAS scores ≥8 and ≥12.
- Neonatal Abstinence Syndrome
- Finnegan Score
- Opioid Withdrawal
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
A simplified scoring system, the simplified Finnegan Neonatal Abstinence Scoring System (sFNAS), is proposed; it was developed from a large data set and cross-validated using a database from another institution.
Cut-off values are provided for the sFNAS that predict the treatment cut-off values of the original Finnegan Neonatal Abstinence Scoring System (FNAS), that is, scores ≥8 and ≥12.
Because the study is retrospective, the inter-rater reliability and the clinical validity of the sFNAS still require prospective validation.
The incidence of neonatal abstinence syndrome (NAS) has steadily increased since the 1970s and is now a significant public health problem.1–4 Tolia et al 5 reported an almost fourfold increase in NAS neonatal intensive care unit (NICU) admissions from 7 cases/1000 admissions in 2004 to 27 cases/1000 in 2013. The median length of stay for infants with NAS increased from 13 to 19 days. The proportion of infants who received pharmacotherapy also increased, from 74% in 2004–2005 to 87% in 2012–2013, resulting in a 35% increase in hospital costs.2
Several scoring systems have been proposed to evaluate infants with NAS.6 These include the Finnegan Neonatal Abstinence Scoring System (FNAS), the Lipsitz tool,7 the Neonatal Withdrawal Inventory8 and the Neonatal Narcotic Withdrawal Index,9 among others. However, there is no consensus as to which tool to use, the cut-off points for treatment or the interval between assessments.3 6 10 11
The FNAS has been the most commonly used scale for the last 40 years.3 4 10–12 The FNAS, developed in 1975,1 13 consisted of 21 items allowing a thorough assessment of infants with NAS. The tool was analysed for inter-rater reliability (mean inter-rater reliability coefficient of 0.82 (0.75–0.96)) and validated for the diagnosis of NAS. However, the lack of subsequent validation and inter-rater reliability testing is a major concern regarding the FNAS.6 After its original publication, the FNAS was slightly modified in form but not in content.14 15
Finnegan et al have proposed using three consecutive scores of 8 or higher, or two consecutive scores of 12 or higher, to initiate pharmacological treatment.1 13 15 Contiguous scores less than 8 are often used as a measure of readiness for weaning pharmacological therapy.16 The validity of the cut-off point of the FNAS was evaluated in 2010 by Zimmermann et al 17 in term newborns without opiate exposure. They found that 95% of the scores were less than 8, with some variability according to the day of life and time of day (lower at night); this led them to conclude that a value above 8 can be considered pathological. This study also supports the use of consecutive scores to identify infants who require pharmacological intervention.
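The consecutive-score rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `meets_treatment_threshold` is hypothetical, scores are assumed to be listed in chronological order, and the sketch is not a clinical decision tool.

```python
from typing import Sequence


def meets_treatment_threshold(scores: Sequence[int]) -> bool:
    """Return True if the chronologically ordered FNAS scores contain
    three consecutive scores >= 8 or two consecutive scores >= 12."""
    run_8 = 0   # length of the current run of scores >= 8
    run_12 = 0  # length of the current run of scores >= 12
    for score in scores:
        run_8 = run_8 + 1 if score >= 8 else 0
        run_12 = run_12 + 1 if score >= 12 else 0
        if run_8 >= 3 or run_12 >= 2:
            return True
    return False


# Hypothetical score series for illustration only.
example_a = [5, 8, 9, 8]      # three consecutive scores >= 8
example_b = [8, 9, 7, 8, 8]   # runs of >= 8 are interrupted by a 7
example_c = [12, 13]          # two consecutive scores >= 12
```

A single high score followed by a lower one does not trigger the rule, which matches the emphasis on consecutive scores in the text.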
The FNAS is a lengthy tool,8 18 and given the continued increase in the number of infants diagnosed with NAS, our goal was to create a shortened and simplified version of the Finnegan Scoring System (sFNAS) that correlates highly with the original FNAS and allows an efficient clinical assessment.
This retrospective study was conducted using data from two institutions, both level IV NICUs and regional referral centres for the respective area. The study was approved by the Institutional Review Board of the University of Louisville and University of Kentucky. Data were collected from the electronic medical records at each institution. Inclusion criteria included neonates with gestational age ≥37 0/7 weeks, admission to the NICU for withdrawal after any in utero opioid exposure and assessment with the FNAS for signs of NAS. Infants with a gestational age less than 37 weeks and those with exposure to psychoactive substances without opioid exposure were excluded. At each institution, the nurses who performed the FNAS had training with experienced scorers (video, demonstration by trainer and reverse demonstration by trainee) and testing for reliability (trainer and trainee assess and score same infant) prior to being assigned care of infants with NAS; retraining in scoring with FNAS was done annually.
The frequency of each item on the FNAS and the total score were first inspected. To derive our proposed sFNAS, items contributing least to the Pearson’s correlation between the sFNAS and FNAS were eliminated until the correlation dropped below 0.95 (table 1). Multiple levels within the remaining items were then combined into fewer levels or subitems if the resulting impact on the correlation was negligible. This was done to simplify the scoring even further, which is particularly useful when it is difficult for scorers to distinguish between different levels of an item. Values assigned to each remaining item were obtained via linear regression and rounded to the nearest integer; the box shows the details of the statistical procedure. We estimated ORs to determine the associations between items. Generalised estimating equations (GEE) with robust SE estimation were used to account for repeated measurements, or scores, from the same subject.19 Tests were two-sided at the 0.01 significance level.
In the development of the sFNAS, the original FNAS scores of ≥8 or ≥12 were considered as the pharmacological treatment cut-off values. Therefore, receiver operating characteristic (ROC) curves based on the accuracy of our shortened score to predict scores of ≥8 or ≥12 from the original FNAS were constructed. Optimal cut-off values for our sFNAS were then determined using the maximum proportion correctly classified (PCC). Sensitivities (Sn), specificities (Sp), positive predictive values (PPV) and negative predictive values (NPV) were also obtained, along with corresponding 95% CIs via the use of GEE.
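The cut-off selection described above can be illustrated with a minimal sketch. It assumes paired shortened and original scores held in plain lists, and it picks the shortened-score cut-off with the highest PCC. The function names and the toy data are hypothetical, and the sketch omits the GEE-based CIs, which account for repeated scores from the same infant.

```python
def classification_metrics(short_scores, full_scores, short_cutoff, full_cutoff):
    """Sn, Sp, PPV, NPV and PCC of (short score >= short_cutoff) as a
    predictor of (original score >= full_cutoff)."""
    tp = fp = tn = fn = 0
    for short, full in zip(short_scores, full_scores):
        predicted = short >= short_cutoff
        actual = full >= full_cutoff
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "Sn": ratio(tp, tp + fn),
        "Sp": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "PCC": ratio(tp + tn, tp + fp + tn + fn),
    }


def best_cutoff_by_pcc(short_scores, full_scores, full_cutoff, candidates):
    """Candidate cut-off on the shortened score with the highest PCC."""
    return max(
        candidates,
        key=lambda c: classification_metrics(
            short_scores, full_scores, c, full_cutoff
        )["PCC"],
    )


# Toy paired scores, for illustration only (not study data).
toy_short = [2, 4, 5, 6, 7, 9]
toy_full = [3, 5, 7, 8, 10, 12]
toy_best = best_cutoff_by_pcc(toy_short, toy_full, 8, range(0, 12))
```

Maximising PCC weights false positives and false negatives equally; a different criterion would be needed if one kind of error were considered more costly.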
Detailed statistical steps and description in the development of the simplified Finnegan Neonatal Abstinence Scoring System (sFNAS)
Optimal values assigned to each item in our sFNAS were obtained via a linear regression model in which the items were used as predictors of the original FNAS. The full estimated regression model with all items is given by:
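$$
\begin{aligned}
\widehat{\mathrm{FNAS}} = {} & b_0 + b_1(\text{Crying–Excessive high pitched}) + b_2(\text{Crying–Continuous high pitched}) \\
& + b_3(\text{Sleeps–$<$3 hours after feeding}) + b_4(\text{Sleeps–$<$2 hours after feeding}) + b_5(\text{Sleeps–$<$1 hour after feeding}) \\
& + b_6(\text{Moro reflex–Hyperactive}) + b_7(\text{Moro reflex–Markedly hyperactive}) \\
& + b_8(\text{Tremors–Mild: disturbed}) + b_9(\text{Tremors–Moderate-severe: disturbed}) \\
& + b_{10}(\text{Tremors–Mild: undisturbed}) + b_{11}(\text{Tremors–Moderate-severe: undisturbed}) \\
& + b_{12}(\text{Increased muscle tone}) + b_{13}(\text{Myoclonic jerks}) + b_{14}(\text{Generalised convulsions}) \\
& + b_{15}(\text{Excoriation}) + b_{16}(\text{Sweating}) + b_{17}(\text{Fever–$<$101}) + b_{18}(\text{Fever–$>$101}) \\
& + b_{19}(\text{Frequent yawn}) + b_{20}(\text{Mottling}) + b_{21}(\text{Nasal stuffiness}) + b_{22}(\text{Sneezing}) + b_{23}(\text{Nasal flaring}) \\
& + b_{24}(\text{Respiratory rate–$>$60/min}) + b_{25}(\text{Respiratory rate–$>$60/min with retractions}) \\
& + b_{26}(\text{Excessive sucking}) + b_{27}(\text{Poor feeding}) + b_{28}(\text{Regurgitation}) + b_{29}(\text{Projectile vomiting}) \\
& + b_{30}(\text{Stools–Loose}) + b_{31}(\text{Stools–Watery})
\end{aligned}
$$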
Here, $\widehat{\mathrm{FNAS}}$ is the predicted FNAS score and $b_0, \ldots, b_{31}$ are estimated regression parameters, or the optimal values assigned to each level of each item. To develop the new scoring system, a backward elimination approach was used, removing the items that contributed least to the Pearson’s correlation between the shortened score and the original FNAS. Parameter estimates from the linear regression model correspond to optimal values for each item with respect to maximising the coefficient of determination, or $R^2$; that is, the scores assigned to each item maximise the amount of variation in the observed original FNAS that can be explained by the given items. This corresponds to a maximisation of the Pearson’s correlation between the score from FNAS and the score that comprised the items in the regression model with their optimally assigned values.
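The link between $R^2$ and the Pearson’s correlation stated above is the standard identity for least-squares regression with an intercept. Writing $y_i$ for an observed FNAS score, $\hat{y}_i$ for its fitted value and $\bar{y}$ for the mean observed score (notation introduced here for illustration), it reads:

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} = \left[\operatorname{corr}(y, \hat{y})\right]^2
$$

Because the least-squares fit maximises $R^2$ over all linear combinations of the included items, it also maximises the correlation between the observed FNAS and any weighted sum of those items.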
The steps in our elimination approach are given in table 1. Items were removed until the correlation dropped below 0.95 (step 10). Levels within the remaining items were then combined if the resulting impact on the correlation was negligible (step 11). This was done to simplify the scoring even further, which is particularly useful when it is difficult for scorers to distinguish between different levels of an item. Optimal values for each remaining item were rounded to the nearest integer.
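The elimination steps can be sketched as follows. This is a simplified illustration on synthetic data, not the analysis itself: it treats each item as a single binary indicator, ignores repeated measurements within infants, and stops just before the correlation would fall below the threshold, which is one possible reading of the stopping rule. All function names and the toy weights are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson's correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fitted_correlation(rows, y, items):
    """Regress y on the chosen items (plus intercept) by least squares and
    return the correlation between y and the fitted values."""
    design = [[1.0] + [row[i] for i in items] for row in rows]
    p = len(items) + 1
    xtx = [[sum(d[a] * d[b] for d in design) for b in range(p)] for a in range(p)]
    xty = [sum(d[a] * yi for d, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    return pearson(y, fitted)


def backward_eliminate(rows, y, threshold=0.95):
    """Repeatedly drop the item whose removal lowers the correlation least,
    stopping before the correlation would fall below the threshold."""
    items = list(range(len(rows[0])))
    while len(items) > 1:
        best_r, drop = max(
            (fitted_correlation(rows, y, [j for j in items if j != i]), i)
            for i in items
        )
        if best_r < threshold:
            break
        items.remove(drop)
    return items


# Synthetic data: three common, heavily weighted items and three rare ones.
random.seed(0)
toy_weights = [3, 2, 2, 1, 1, 1]
toy_prevalence = [0.5, 0.4, 0.3, 0.05, 0.05, 0.05]
toy_rows = [[1 if random.random() < p else 0 for p in toy_prevalence]
            for _ in range(500)]
toy_total = [sum(w * v for w, v in zip(toy_weights, row)) for row in toy_rows]
kept_items = backward_eliminate(toy_rows, toy_total, threshold=0.95)
```

In this toy setting the rare, low-weight items are removed first, in line with the observation in the text that infrequently observed items contribute little to the correlation.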
At the end of step 11, the estimated linear regression model is given by:
$$
\begin{aligned}
\widehat{\mathrm{FNAS}} = {} & 0.93 + 2.00(\text{Crying–Excessive or continuous high pitched}) + 1.43(\text{Sleeps–$<$2 or 3 hours after feeding}) \\
& + 3.22(\text{Sleeps–$<$1 hour after feeding}) + 1.33(\text{Any disturbed tremors}) + 4.72(\text{Any undisturbed tremors}) \\
& + 2.17(\text{Increased muscle tone}) + 1.14(\text{Nasal stuffiness}) + 1.26(\text{Respiratory rate–$>$60/min, retraction or no retraction}) \\
& + 0.99(\text{Excessive sucking}) + 1.99(\text{Poor feeding}) + 2.11(\text{Feed tolerance–Regurgitation or projectile vomiting}) \\
& + 1.99(\text{Stools–Loose or watery})
\end{aligned}
$$
In order to simplify scoring with the sFNAS, we rounded the values in the above equation to the nearest integer. We note that b0 is a nuisance parameter, and although it rounds to 1 in the above equation, we do not include this value in the computation of the sFNAS.
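As an illustration of the rounding step, the sketch below rounds the coefficients of the step 11 equation to the nearest integer and sums the values for the signs observed at one assessment, leaving out the intercept as described. The dictionary keys and the function name `sfnas_score` are hypothetical labels, and the integer values are obtained here by rounding rather than copied from table 2.

```python
# Coefficients from the step 11 regression equation (intercept 0.93 omitted).
STEP_11_COEFFICIENTS = {
    "crying_excessive_or_continuous_high_pitched": 2.00,
    "sleeps_lt_2_or_3_hours_after_feeding": 1.43,
    "sleeps_lt_1_hour_after_feeding": 3.22,
    "any_disturbed_tremors": 1.33,
    "any_undisturbed_tremors": 4.72,
    "increased_muscle_tone": 2.17,
    "nasal_stuffiness": 1.14,
    "respiratory_rate_gt_60_per_min": 1.26,
    "excessive_sucking": 0.99,
    "poor_feeding": 1.99,
    "regurgitation_or_projectile_vomiting": 2.11,
    "stools_loose_or_watery": 1.99,
}

# Round each coefficient to the nearest integer, as the text describes.
ROUNDED_VALUES = {name: round(b) for name, b in STEP_11_COEFFICIENTS.items()}


def sfnas_score(observed_signs):
    """Sum the rounded values of the signs observed at one assessment.

    Assumption: the caller passes at most one level per multi-level item
    (for example, only one of the two sleep levels).
    """
    unknown = set(observed_signs) - set(ROUNDED_VALUES)
    if unknown:
        raise KeyError(f"unrecognised signs: {sorted(unknown)}")
    return sum(ROUNDED_VALUES[sign] for sign in set(observed_signs))


# Hypothetical assessment: increased tone, poor feeding and loose stools.
example_score = sfnas_score(
    {"increased_muscle_tone", "poor_feeding", "stools_loose_or_watery"}
)
```

With these rounded values, the hypothetical assessment above reaches the lower sFNAS cut-off of 6.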
To validate our sFNAS, we applied the scoring to the data from institution 2. Specifically, we computed the Pearson’s correlation and the classification proportions, and obtained the values of Sn, Sp, PPV, NPV and PCC. Analyses were conducted in SAS V.9.4.
The data comprised a whole year’s (2014) observations from 185 subjects of institution 1 contributing a total of 27 447 scores, and from 182 babies from institution 2 contributing a total of 12 847 scores. The number of observations per baby in institution 1 ranged from 1 to 605, with a mean (SD) of 148 (116) and 25th, 50th and 75th percentiles of 38, 153 and 210, respectively. The number of observations per baby in institution 2 ranged from 1 to 310, with a mean (SD) of 70 (59) and 25th, 50th and 75th percentiles of 18, 60 and 106, respectively. The variability in the number of scores per infant was due to the inclusion of some infants in the study in the middle or end of their treatment course. Although the number of infants between institutions was similar, the total number of scores differed because in institution 2 scoring was changed to every 6 hours from every 3 hours, when an infant showed improvement in scores following initiation of pharmacotherapy; the goal was to minimise disturbing the infant to promote more sleep and allow ad lib feedings.
Some signs differed in percentages of occurrence between the two institutions. The sign most frequently noted at both institutions was increased muscle tone. Tremors, excessive cry, sleep <3 hours, respiratory rate >60/min and loose stools were each observed in more than 10% of instances in both institutions. Gastrointestinal manifestations were more frequently observed in institution 2.
The backward elimination approach was applied to the data from institution 1 (table 1). The majority of items that were not retained were observed in less than 10% of instances; therefore, they did not contribute enough to the Pearson’s correlation to warrant their inclusion in the sFNAS. Mottling was present in more than 10% of the observations; however, its elimination did not greatly affect the Pearson’s correlation (table 1). The proposed sFNAS comprised 10 items as presented in table 2. The Pearson’s correlation of our sFNAS with the original FNAS is 0.914. The interrelatedness of the items that were removed and the items that were kept in the sFNAS is given in table 3. All items that were removed were significantly related to the items that were kept in the sFNAS. Mottling is the removed item that was most significantly related to the others, as it had significant associations with 8 out of the 10 items that were retained in the sFNAS. The sFNAS predicted original FNAS scores ≥8 and ≥12 with areas under the ROC curve of 0.952 and 0.982, respectively. Based on the PCC, the optimal treatment cut-offs for our shortened scale are 6 and 10, respectively. The PCC for institution 1 for the cut-off values of 6 and 10 was 0.884 and 0.980, respectively, with comparable values of 0.867 and 0.920, respectively, for institution 2. Values of Sn, Sp, PPV, NPV and PCC are given in table 4.
For the statistical validation of the sFNAS, the scoring system applied to the data from institution 2 resulted in a Pearson’s correlation of 0.908 with the original FNAS. Observed values for Sn, Sp, PPV, NPV and PCC from the use of sFNAS are also presented in table 4.
When deriving a new scale, it is likely that some values near the cut-off will be incorrectly classified. Therefore, we now look at the misclassified cases, that is, observations whose sFNAS scores were close to the sFNAS cut-offs corresponding to the FNAS cut-offs of 8 and 12. There were 20 197 FNAS scores of 0–7 for the original data and 6969 for the cross-validation data. Of these observations, 9.8% and 9.6%, respectively, corresponded to an sFNAS score of 6, the lower cut-off for the sFNAS. The total FNAS scores from 0 to 11 for the original and cross-validation data sets, respectively, were 26 497 and 10 940. Only 0.6% and 0.9%, respectively, were misclassified, corresponding to an sFNAS score of 10, the higher cut-off for the sFNAS. Of the FNAS scores ≥8 (original data set: 7250 observations; cross-validation data: 5784), 8.4% and 9.7%, respectively, were misclassified as sFNAS=5, below the cut-off of 6 for the sFNAS. As to FNAS scores ≥12 (original observations: 950; cross-validation data: 1813), 196 or 20.6% and 397 or 21.9%, respectively, were misclassified as sFNAS=9, lower than the higher sFNAS cut-off of 10.
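The figures in this paragraph can be cross-checked against the reported Sn, Sp and PCC values, since PCC is the average of Sn and Sp weighted by the number of observations at or above and below the FNAS cut-off. The short sketch below is an illustrative arithmetic check using only the counts and proportions quoted in the text; the function name is hypothetical.

```python
def pcc_from_sn_sp(sn, sp, n_at_or_above, n_below):
    """Proportion correctly classified implied by Sn, Sp and group sizes."""
    return (sn * n_at_or_above + sp * n_below) / (n_at_or_above + n_below)


# Institution 1 (reported PCC: 0.884 for cut-off 6, 0.980 for cut-off 10).
inst1_cut6 = pcc_from_sn_sp(0.888, 0.883, 7250, 20197)
inst1_cut10 = pcc_from_sn_sp(0.637, 0.992, 950, 26497)

# Institution 2 (reported PCC: 0.867 for cut-off 6, 0.920 for cut-off 10).
inst2_cut6 = pcc_from_sn_sp(0.860, 0.873, 5784, 6969)
inst2_cut10 = pcc_from_sn_sp(0.525, 0.986, 1813, 10940)

# Share of FNAS scores >= 12 that had an sFNAS score of 9.
pct_inst1 = round(100 * 196 / 950, 1)
pct_inst2 = round(100 * 397 / 1813, 1)
```

The implied PCC values agree with the reported ones to within rounding of the published Sn and Sp.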
We propose a simplified scoring system, the sFNAS, to evaluate infants with NAS. The sFNAS is a shortened FNAS derived by maximising its correlation with the original scale. Our scoring system, consisting of 10 items, provides a Pearson’s correlation of 0.914 with the original FNAS and proposed cut-offs of 6 and 10 (instead of 8 and 12). Using the proposed cut-offs, the simplified scoring system provides excellent Sp and NPV. The cross-validation of the sFNAS provided a Pearson’s correlation of 0.908 and adequate values of Sn, Sp, PPV and NPV. To our knowledge, this is the first shortened or simplified NAS scoring system to have undergone a cross-validation process.6
The 21-item original FNAS assesses the central nervous system, autonomic and gastrointestinal signs of infants with NAS. In its development, each of the categories was analysed to include the major clinical signs,13 resulting in a comprehensive scoring system that included clinically significant items that were often highly correlated, and thereby increasing the number of manifestations to be assessed.20 Although the sFNAS is shortened to 10 items, it maintains more than one item within each category of signs of NAS.
The assessment of the central nervous system signs in the sFNAS included four items (cry, tremors, tone and sleep) describing neurological excitability. These items were also described or included in other scoring systems,7 8 13 20 21 with reported prevalence comparable to our results. Moro reflex and myoclonic jerks were not retained in the sFNAS, but were significantly related to items retained in the new scoring system. Seizures are a very important sign of NAS, with a high assigned score value in the original FNAS; however, its occurrence was noted in less than 0.1% of the more than 40 000 observations that we analysed.
As to autonomic manifestations, nasal stuffiness and respiratory rate were the items that contributed most to the FNAS and were included in the sFNAS. The respiratory rate has been reported to be associated with a lack of respiratory control or an abnormal breathing pattern of infants with NAS.22 23 Gewolb et al 22 found subtle abnormalities in respiratory control and swallow rhythmicity in infants exposed to opioids and cocaine. The investigators proposed that in utero drug exposure may affect the neuronal development and organisation in brainstem areas, resulting in abnormalities in the coordination of suck–swallow–respiration. Other autonomic manifestations noted in our infants were frequent yawning, fever, sneezing and sweating; these were rare and/or strongly associated with the manifestations included in the sFNAS.
Of the gastrointestinal manifestations, all four items in the original FNAS were retained in the sFNAS. Prolonged sucking with fewer pauses associated with increased spitting episodes has been reported in infants with NAS,13 20 24 25 but the significance of these signs remains controversial.22 24 Loose or watery stools have been reported to be a sign of excessive gastrointestinal irritability in infants undergoing opioid withdrawal.26 27 The loose or watery stools alone contributed notably to the correlation as an independent and clinically significant sign in NAS.
There were differences in percentages of occurrence of some items between the two institutions in our study. Reports indicate variation in the prevalence of the signs of NAS across different centres.10 15 20 21 28 Since external factors and nurse variability have minimal influence on the FNAS,29 we believe that this variability is more related to intrinsic neonatal factors. Regardless of the difference in percentages of occurrence of some items between our institutions, the Pearson’s correlation was high with adequate Sn and excellent Sp, PPV, NPV and PCC. High Sp and NPV are important for identifying infants who should not require pharmacotherapy.
There were other scales developed concomitantly with the FNAS. The Ostrea tool30 is a six-item scale that ranks multiple NAS signs but with no proposed guide for treatment. The Neonatal Narcotic Withdrawal Index9 was designed in 1981 with adequate validity, but it did not gain much popularity. The Narcotic Withdrawal Score by Lipsitz7 in 1975 included 11 items and evaluated infants twice daily. The tool proposed to initiate treatment when one score is greater than 4. However, studies have shown that a high score may not necessarily be obtained in subsequent assessments. Therefore, consecutive scores meeting cut-off would be pertinent to initiate pharmacological treatment.1 17
There were previous attempts to modify the Finnegan scoring system.27 The Neonatal Withdrawal Inventory8 proposed an 8-point checklist that was derived from the FNAS with reported inter-rater reliability, sensitivity and specificity; however, the scoring system was not validated. Jansson et al 25 developed a new scoring system (MOTHER NAS scale), adding and removing items to the FNAS to create a 19-item scale, recommending treatment on scores ≥9. This score was used in the Maternal Opioid Treatment: Human Experimental Research (MOTHER) project.31 Subsequently, it was modified to develop a short screening tool32; its validation awaits further studies. In 2015, institution 2 started assessing infants with NAS using the MOTHER NAS scale. From a data set of 17 150 observations in 276 infants over a period of 1 year, the sFNAS compared with the MOTHER NAS scale resulted in a Pearson’s correlation of 0.86. The sFNAS showed Sn=0.96, Sp=0.80, PPV=0.55, NPV=0.99 and PCC=0.83 to predict scores ≥9 on the MOTHER NAS scale. This correlation between the sFNAS and the MOTHER NAS scale is another support for the potential utility of the sFNAS in the assessment of NAS.
In 2013, Maguire et al 18 reported on an FNAS-short form. They performed factor analysis and their proposed scale contained seven items with a reported Pearson’s correlation of 0.917 with their original data set. Since this score is very similar to ours, we applied it to our data sets. The correlation between their seven-item score and the original FNAS score was 0.818 and 0.811 for institutions 1 and 2, respectively, which is lower than our correlations of 0.914 and 0.908 for institutions 1 and 2, respectively. This difference may be attributed to having more items included in the sFNAS.
As to limitations, the sFNAS was solely derived to shorten the original FNAS and not to improve on the psychometric properties of the original tool.33 Our study did not include a prospective clinical evaluation of inter-rater reliability, and therefore this will need to be addressed in future studies. Also, our study was not designed to determine the utility of the sFNAS in adjusting doses for pharmacological treatment; this is the next logical step after prospectively establishing its inter-rater reliability, sensitivity and specificity as a tool for initiation of treatment. Future studies should also include determination of the proportion of infants correctly classified as meeting threshold cut-offs from both the sFNAS and the original FNAS, as well as the percentage of those with discrepant scores. Indeed, the lack of uniformity in the assessment of infants with NAS2 5 10 11 28 34 can complicate the clinical care and increase the hospital costs for the affected infants.
The sFNAS provides a shortened and simplified assessment for infants with NAS. Developed with a rigorous statistical approach and with cross-validation, the sFNAS is not only abbreviated, but easily administered with minimal handling or interaction with the infant. The sFNAS having a high correlation with the original FNAS and with the MOTHER NAS scale makes it an attractive, efficient and simple alternative to the use of these lengthy tools. Further studies are needed to establish clinical utility, validity and reliability prior to a widespread application of the sFNAS.
Contributors EGP designed the study, was involved in data collection and interpretation of results, drafted the initial manuscript and subsequent revisions, and approved the final manuscript as submitted. LPF conceptualised the study, was involved in the analysis and interpretation of results and in review of the draft and revisions, and approved the final manuscript as submitted. LD collaborated in the design of the study, was involved in the data collection, analysis and interpretation of the results, critically reviewed the manuscript and approved the final manuscript as submitted. VAC participated in the concept and design of the study, was involved in the data collection, critically reviewed the manuscript and approved the final manuscript as submitted. KTI participated in the concept and design of the study, designed the data collection instrument, was involved in the data collection and interpretation of results, critically reviewed the manuscript and approved the final manuscript as submitted. HB co-conceptualised the study, collaborated in the design and development of the data collection instrument, was involved in the analysis and interpretation of the results, critically reviewed the manuscript draft and revisions, and approved the final manuscript as submitted. PMW carried out the extensive statistical analysis, was involved in the review and interpretation of results, critically reviewed the manuscript and approved the final manuscript as submitted. All authors approved the final manuscript as submitted and agreed to be accountable for all aspects of the work.
Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent Retrospective chart review of discharged patients.
Ethics approval Institutional Review Board of the University of Louisville and University of Kentucky.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All available data were used in the development of this study and can be obtained by contacting the corresponding author.