Classification of tic disorders based on functional MRI by machine learning: a study protocol
  1. Fang Wang,
  2. Fang Wen,
  3. Jingran Liu,
  4. Junjuan Yan,
  5. Liping Yu,
  6. Ying Li,
  7. Yonghua Cui
  1. Department of Psychiatry, Beijing Children's Hospital, Beijing, China
  1. Correspondence to Professor Yonghua Cui; cuiyonghua{at}; Dr Ying Li; liying{at}


Introduction Tic disorder (TD) is a common neurodevelopmental disorder in children, and it can be categorised into three subtypes: provisional tic disorder (PTD), chronic motor or vocal TD (CMT or CVT), and Tourette syndrome (TS). An early diagnostic classification among these subtypes is not possible based on a new-onset tic symptom. Machine learning tools have been widely used for early diagnostic classification based on functional MRI (fMRI). However, few machine learning models have been built for the diagnostic classification of patients with TD. Therefore, in the present study, we will provide a study protocol that uses the machine learning model to make early classifications of the three different types of TD.

Methods and analysis We plan to recruit 200 children aged 6–9 years with new-onset tic symptoms and 100 age-matched and sex-matched healthy controls, all of whom will undergo resting-state MRI scanning. Based on the resting-state fMRI data, a support vector machine (SVM) model using functional connectivity will be built for the early diagnostic classification of TD subtypes (PTD, CMT/CVT and TS).

Ethics and dissemination This study was approved by the ethics committee of Beijing Children’s Hospital. The trial results will be submitted to peer-reviewed journals for publication.

Trial registration number ChiCTR2000033257.

  • Child & adolescent psychiatry
  • Protocols & guidelines

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • Machine learning will be used to make early classifications of the three different types of tic disorder (TD) rather than differentiating patients with TD from healthy controls.

  • The generalisability of different machine learning algorithms, including linear support vector machine (SVM), SVM recursive feature elimination and multiple kernel learning, in making early diagnoses will be explored.

  • Correlations between changes in functional connectivity and premonitory urges and the severity of tic symptoms will be analysed in 200 children aged 6–9 years with new-onset tic symptoms.

  • Only the functional MRI data are included in the SVM model, and multimodal MRI data will be included in further studies.

  • Only children aged 6–9 years will be included in the analysis, and other age groups need to be studied.


Tic disorder (TD) is a common neurodevelopmental disorder in children. It has been reported that 0.2%–46.3% of children experience tics in their lifetime.1 TD can be categorised into three subtypes based on the symptoms (motor tics or vocal tics) and the duration of symptom persistence. The subtypes include provisional TD (PTD), chronic motor or vocal TD, and Tourette syndrome (TS).2 Notably, different subtypes of TD might require different treatments.3 Therefore, early diagnostic classification has been regarded as one of the most important clinical issues that need to be addressed.4

However, TD subtypes can currently be classified only through retrospective interviews based on the symptoms and their duration.5 Therefore, it is not yet possible to judge the classification6 and prognosis7 of TD based on a new-onset tic symptom. In other words, there is currently no way to make an early diagnostic classification when tic symptoms first appear.

The development of MRI provides new insight into the pathophysiology of TD. The sensory discomfort of TD is related to abnormalities in white matter in the orbitofrontal cortex.8 In addition, the core features of TD seem to be the abnormal cortex-striatal-thalamic-cortical circuit (CSTC).9 The subtypes of TD and TS show structural abnormalities in the supplementary motor area, somatosensory cortex, premotor cortex, limbic system and basal ganglia.10 11 Furthermore, patients with TS show abnormal functional connectivity (FC) between the striatum, thalamus, primary motor cortex and sensory cortex.12 Therefore, neuroimaging abnormalities could be promising biomarkers for the diagnosis of TD.13

Recently, radiomics has shown great potential in aiding clinical diagnosis by analysing medical images.14 In radiomics analysis, features are extracted from high-throughput image data, and machine learning methods can be used to analyse these features and build models that support clinical decision-making.15 Radiomics has already been used to help diagnose psychiatric diseases.16 17 Indeed, several studies have begun to use machine learning in the diagnosis of TD.18 19 One study used a support vector machine (SVM) based on functional MRI (fMRI) data to distinguish children with TS from healthy controls with an accuracy of 74%.20 Another study also reported an SVM model for distinguishing patients with TS from healthy controls with high accuracy and sensitivity (accuracy=92.86%, sensitivity=91.67%).21 SVM is one of the most commonly used models for the diagnosis of TD, and it may also be suitable for the classification of different subtypes of TD.

Therefore, this protocol aims to use SVM to identify patients with PTD, chronic TD and TS based on the fMRI data. It would be useful for clinicians to use a predictive model to make an early diagnosis of the three different types of TD.

Methods and analysis

This is a single-centre study that will be carried out at the Department of Psychiatry in Beijing Children’s Hospital (China). The participants will be enrolled consecutively. Predefined inclusion and exclusion criteria will be used to identify eligible individuals. This project was approved and is monitored by the Ethics Committee of Beijing Children’s Hospital. Written informed consent will be obtained from the participant and/or their guardian before they are included in this study. This study is supported by the Special Fund of the Pediatric Medical Coordinated Development Center of Beijing Hospitals Authority (No. XTYB201802).

Recruitment of participants

The recruitment period will last from 1 June 2020 to 30 June 2021. All the participants will be recruited in the Department of Psychiatry of Beijing Children’s Hospital of Capital Medical University. Outpatients who show new-onset tic symptoms will be invited to participate in this study. A flyer for this study will be posted in the waiting room of the Department of Psychiatry in Beijing Children’s Hospital. The clinicians in the department will introduce the study to patients who fulfil the requirements of this study. The healthy controls will be recruited from a local primary school with which the department has a long-term cooperation. Informed consent will be obtained from the children and/or their legal guardians.

Inclusion criteria and exclusion criteria for included participants

The inclusion criteria of patients with TD are as follows: (1) aged 6–9 years; (2) Chinese Han nationality; (3) tic symptoms appearing during the past 3 months; (4) fulfilled the diagnosis of TD according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5); (5) never used any drugs to ease tic symptoms; (6) left-handed and (7) IQ higher than 80.

The exclusion criteria of patients with TD are as follows: (1) a comorbid diagnosis of schizophrenia, autism spectrum disorder, bipolar disorder based on screening of the Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime version (Kiddie-SADS-PL)22; (2) tic symptoms longer than 1 year; (3) epilepsy and other neurological diseases or serious physical condition; and (4) antipsychotic use during the past 3 months for treatment.

For the healthy control group, the inclusion criteria are as follows: (1) matched with the patients with TD in age, gender and IQ; and (2) no diagnosis of any mental disorder based on the screening of Kiddie-SADS-PL. The exclusion criteria are (1) any psychiatric or neurological disease and (2) having taken antipsychotics during the past 3 months for treatment.

There is no consensus about how to calculate the sample size needed in machine learning. However, in previous studies, machine learning models showed good accuracy in differentiating patients with mental disorders from healthy controls with sample sizes between 150 and 200.23 24 Furthermore, we summarise some previous studies that used SVM to differentiate patients with TS from healthy controls in table 1. Considering that there might be some dropout during follow-up, we decided to recruit 200 children with tic symptoms in our study. In addition, we plan to enrol 100 healthy controls in this study.

Table 1

Examples of the application of SVM based on the radiomics in TS

Neuroimaging data collection

All participants will be scanned using an MR750 3.0T scanner (General Electric Company) at the Department of Radiology in Beijing Children’s Hospital. First, a three-dimensional magnetisation prepared rapid gradient echo sequence will be used to acquire T1-weighted images with the following parameters: TR=2530 ms, TE=2.34 ms, TI=1100 ms, resolution=256×256, flip angle=7°, slices=176 and slice thickness=1 mm. The T1 sequence will last for 649 s. After T1-weighted image acquisition, all participants will also undergo a 60 s T2-weighted scan to exclude incidental central nervous system diseases. We will then acquire resting-state fMRI (rs-fMRI) scans by T2*-weighted echo planar imaging with the following parameters: TR=2500 ms, TE=21 ms, slices=42, flip angle=90°, slice thickness=3.5 mm, field of view=200 mm, resolution=64×64, voxel size=3.1×3.1×3.5 mm and bandwidth=2520 Hz/Px.

Neuroimaging data preprocessing

The rs-fMRI data will be preprocessed using the Data Processing & Analysis for Brain Imaging (DPABI) toolbox, following previous studies,21 25 and the detailed steps are listed below:

  1. The first 10 images will be removed to reduce potential noise.

  2. Correction for temporal differences: because each slice is acquired at a different time, all slices will be corrected for time offsets so that the slices within each TR are aligned to the same acquisition time.

  3. Correction for head movement: the volumes will be co-registered to one another, and the head position parameters, comprising translational and rotational displacements along the X, Y and Z axes, will be estimated. Neither translation nor rotation parameters in any given data set may exceed ±3 mm or ±3°.

  4. Participants whose maximum head movement exceeds 3 mm or 3° will be excluded.

  5. The T1 images will be segmented into grey matter, white matter and cerebrospinal fluid.

  6. The T1 images will be co-registered to the functional images.

  7. Spatial normalisation: the T1 images and the functional data will be normalised to Montreal Neurological Institute (MNI) space using first linear and then non-linear transformation. After being normalised to the MNI template, the functional images will be resampled to 3×3×3 mm3 resolution.

  8. Spatial smoothing: the images will be smoothed with a 4×4×4 mm3 full width at half maximum kernel to reduce individual differences remaining after normalisation.

  9. Detrending and temporal filtering: linear trends will be removed, and temporal bandpass filtering (0.01–0.08 Hz) will be applied to reduce low-frequency drift and high-frequency physiological noise.
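The head-movement exclusion rule in steps 3 and 4 can be illustrated with a short sketch. This is not the DPABI implementation; the `MotionParams` container and the function name are hypothetical, and the sketch assumes that the realignment parameters are already expressed in millimetres and degrees.

```python
from dataclasses import dataclass
from typing import Sequence

MAX_TRANSLATION_MM = 3.0  # threshold stated in the protocol
MAX_ROTATION_DEG = 3.0    # threshold stated in the protocol


@dataclass
class MotionParams:
    """Six rigid-body parameters for one volume (hypothetical container).

    Translations are assumed to be in mm and rotations in degrees.
    """
    tx: float
    ty: float
    tz: float
    rx: float
    ry: float
    rz: float


def exceeds_motion_limit(volumes: Sequence[MotionParams]) -> bool:
    """Return True if any volume exceeds 3 mm translation or 3 degrees rotation."""
    for v in volumes:
        if max(abs(v.tx), abs(v.ty), abs(v.tz)) > MAX_TRANSLATION_MM:
            return True
        if max(abs(v.rx), abs(v.ry), abs(v.rz)) > MAX_ROTATION_DEG:
            return True
    return False
```

A participant for whom `exceeds_motion_limit` returns `True` would be dropped before normalisation under this reading of the rule.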

Machine learning classification

Machine learning enables a computer to recognise patterns in feature data without being explicitly programmed. The key part of machine learning lies in pattern recognition. The main process includes (1) extracting and selecting features for classification, (2) constructing a predictive model and (3) assessing the generalisation ability of the predictive model.

After the preprocessing of neuroimaging data, we plan to use principal component analysis to analyse the rs-fMRI data. After this step, dozens of features of the different subtypes of TD can be extracted. These features will then be used for machine learning classification. As listed in table 1, SVM is a classification method widely used in the study of neuropsychiatric diseases.26 It is a pattern recognition algorithm that constructs a separating boundary with maximum margin between the data classes. SVM classification has shown high accuracy and generalisation ability. In our study, we will mainly use machine learning algorithms based on SVM. Both the linear SVM and SVM recursive feature elimination (SVM-RFE) will be used. In addition to the SVM algorithms, we also plan to use the multiple kernel learning (MKL) algorithm. The accuracy, sensitivity and specificity will be calculated as the evaluation parameters of the SVM model.
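To make the maximum-margin idea concrete, the following is a minimal sketch of a two-class linear SVM trained by batch subgradient descent on the regularised hinge loss. It is an illustration only, not the software that will be used in the study; the function names, the regularisation weight `lam`, the learning rate `lr` and the toy data are all assumptions.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Minimise lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent.

    Labels must be +1 or -1. Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularisation term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Assign the class according to the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (hypothetical two-feature example).
X_toy = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]]
y_toy = [1, 1, -1, -1]
w_toy, b_toy = train_linear_svm(X_toy, y_toy)
```

Because a single SVM separates two classes, a three-subtype problem would typically be handled by combining several binary classifiers, for example one-vs-rest; that choice is an assumption here and is not specified in the protocol.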

In this study, we will use 10-fold cross-validation to assess generalisability.18 The data set of all participants will be randomly split into 10 parts; nine parts will serve as training data for the SVM model, and the remaining part will serve as a test data set to assess the classification ability of the model. Specifically, after preprocessing, the rs-fMRI data of all 300 participants (200 children with tic symptoms and 100 healthy controls) will be given a new sample ID from 1 to 300. Then, 30 random numbers will be generated. These 30 participants will be designated the validation set, while the remaining 270 participants will serve as the training set. The functional network properties of these 270 participants will be used to train the SVM model. Features will be selected using only data from the training set. Then, the validation set will be used to evaluate the performance of the SVM model. These procedures will be repeated 10 times; the model with the best results will be chosen, and the average accuracy will be calculated.27 The procedure of the whole study is shown in figure 1.
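The splitting scheme described above (sample IDs 1 to 300, 30 held out per fold, 270 used for training) can be sketched as follows. The function name and the fixed random seed are assumptions introduced for reproducibility of the illustration; the protocol does not specify them.

```python
import random
from typing import List, Tuple


def ten_fold_splits(
    n_participants: int = 300, n_folds: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return (training IDs, validation IDs) for each fold; IDs run from 1 to n."""
    ids = list(range(1, n_participants + 1))
    random.Random(seed).shuffle(ids)  # fixed seed: an assumption, for reproducibility
    fold_size = n_participants // n_folds
    splits = []
    for k in range(n_folds):
        validation = ids[k * fold_size:(k + 1) * fold_size]
        held_out = set(validation)
        training = [i for i in ids if i not in held_out]
        splits.append((training, validation))
    return splits


splits = ten_fold_splits()
```

Within each fold, feature selection and model fitting would use only the 270 training IDs, so that the 30 validation participants never influence the selected features.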

Figure 1

The procedure for building the SVM model. ALFF, amplitude of low-frequency fluctuation; CTD, chronic tic disorder; FC, functional connectivity; PTD, provisional tic disorder; ReHo, regional homogeneity; rs-fMRI, resting-state functional MRI; SVM, support vector machine; TS, Tourette syndrome.

Clinical measurements

The questionnaire includes demographic data such as age and gender. It also includes items about the structure and function of the family, such as the occupation of parents, their education level, age, the relationship between husband and wife, type of family education, etc. Perinatal data include parental age during pregnancy, and whether parents smoked, drank alcohol, went through physical disease, and took medication during pregnancy. Growth and development data will be recorded, such as language and motor development, and history of organic diseases such as brain tumours, brain trauma, convulsions, carbon monoxide poisoning and night terrors. These are all factors that might be associated with the onset of TD.28–30

Tic symptom severity will be rated by the widely used Yale Global Tic Severity Scale (YGTSS).31 YGTSS is a clinician-rated scale to measure the severity of motor and vocal tic symptoms. Tic symptoms will be evaluated by observation of the children and interviewing children and their parents about the number, frequency, intensity, complexity and interference of tic symptoms. For every domain mentioned above, a 0–5 score is used, with 0 indicating there is no tic and 5 indicating the most severe condition in the domain. The Chinese version was translated by Youquan Zhong et al and was proven to have good reliability and validity.32

The Premonitory Urges for Tics Scale (PUTS) is a self-reported scale to assess premonitory urges; it contains nine sentences that describe unpleasant sensory phenomena during the tic. Each item is rated on a 4-point scale, where 1 stands for ‘not at all true’ and 4 stands for ‘very much true’.33 The total PUTS score has shown a strong association with tic severity.34 Our group worked with the author of PUTS to translate it into Chinese, and we are testing the reliability and validity of the PUTS-Chinese version. PUTS will be used to assess premonitory urges in this study.

The Clinical Global Impressions Scale (CGI) is a clinician-rated scale to evaluate the global symptom severity of mental disorders. It is a subjective evaluation based on a clinician’s experience of the disease. CGI uses a 7-point measure of illness severity: a patient who is ‘normal, not at all ill’ is rated 1, and a patient who is considered ‘extremely ill’ is rated 7.35 In this study, CGI will be used as a supplement to YGTSS to evaluate tic symptom severity.

Children and their parents will complete a demographic data form and the self-reported PUTS together. Clinicians on the research team will administer the YGTSS and CGI. The clinicians (n=3) in this study were trained to carry out the YGTSS, PUTS and CGI evaluations, and their inter-rater agreement is good (intraclass correlation coefficient (ICC) ≥0.85). The clinical assessments will be performed at four time points (baseline, 3 months, 6 months and 12 months). The measurements at every time point are listed in table 2.

Table 2

Measurements at each follow-up time point

Statistical analysis

We plan to use independent-sample t-tests and Χ2 tests to analyse the demographic and clinical data. The age and IQ difference between the TD group and the healthy control group will be compared using independent-sample t-tests. The Χ2 test will be used to compare the sex ratio between the two groups. Similar to previous studies, we will use the two-sample t-test to compare the rs-fMRI data between the TD group and the healthy control group. For significant changes in the FC network in the TD group, Pearson’s correlation will be performed to calculate their correlation with the YGTSS and PUTS scores. We plan to use SVM-RFE to extract features from the processed rs-fMRI.
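As a worked illustration of the correlation step, Pearson’s correlation coefficient between a connectivity measure and a clinical score can be computed as below. The function name and the example values are hypothetical and do not come from study data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: FC strength of one connection and YGTSS total scores.
fc_strength = [0.12, 0.25, 0.31, 0.40, 0.52]
ygtss_total = [10.0, 14.0, 19.0, 22.0, 30.0]
r = pearson_r(fc_strength, ygtss_total)
```

Testing many connections in this way would call for a correction for multiple comparisons; the protocol does not state which correction will be applied.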

Primary and secondary outcomes

Primary outcome

The main purpose of our study is to evaluate the performance of a machine learning model in the classification of TD. Therefore, the primary outcomes will be the accuracy, sensitivity, specificity and area under the curve of the prediction models based on the SVM, SVM-RFE and MKL algorithms.
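For reference, accuracy, sensitivity and specificity can be derived from true and predicted labels as in the sketch below. Treating one subtype as the positive class against the other two (one-vs-rest) is an assumption made for illustration, and the function name and example labels are hypothetical.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity with `positive` against all other classes."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical labels for six participants.
true_labels = ["TS", "TS", "PTD", "CTD", "PTD", "TS"]
pred_labels = ["TS", "PTD", "PTD", "CTD", "TS", "TS"]
ts_metrics = one_vs_rest_metrics(true_labels, pred_labels, positive="TS")
```

Sensitivity is the proportion of true cases of the chosen subtype that the model identifies, and specificity is the proportion of participants without that subtype who are correctly labelled as not having it.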

Secondary outcomes

The FC difference between patients with TD and healthy controls will be reported as the secondary outcome. The correlation between significant FC changes and clinical severity will also be explored.

Patient and public involvement

The research question and outcome measures, namely whether analysing fMRI with machine learning can support a classification diagnosis of TD when tic symptoms first appear, take into account patients’ priorities, experience and preferences. Patients will not be involved during the study design phase. During the recruitment and implementation phases of the study, the concerns and questions of patients will be addressed by clinical doctors. The trial results will be submitted to peer-reviewed journals for publication.

Ethics and dissemination

This study was approved by the ethics committee of Beijing Children’s Hospital. The trial results will be submitted to peer-reviewed journals for publication.


Machine learning is an important field in artificial intelligence. It has been widely used in analysing the neuroimaging data of psychiatric patients and shows great potential in the diagnosis, prognosis prediction and treatment outcome prediction of psychiatric disease.26 SVM is the most common method used in machine learning.36 SVM enables computers to discover the patterns of neuroimaging data in a supervised way.26 It can be applied to a variety of neuroimaging data, such as rs-fMRI data,37 diffusion tensor imaging data18 or diffusion-weighted imaging data,38 and shows high accuracy in differentiating psychiatric patients from healthy controls. In the present study, SVM will be used to build a predictive model to classify the three types of TD based on fMRI data. If this model shows good accuracy for the diagnosis of the different types of TD, it will help clinicians to make better clinical choices for the treatment of TD.

According to the current diagnostic criteria (such as DSM-5), any new-onset tics are diagnosed as PTD. However, no biomarker predicts the prognosis of PTD, specifically whether the tics persist. Previous findings suggest that some features, including the presence of subsyndromal autism spectrum symptoms and anxiety disorder, and reward-related suppression of tic symptoms, may be associated with the prognosis of PTD.39 Therefore, many ‘risk factors’ might influence the development of tic symptoms. Among these factors, premonitory urges (PUs) seem to be the most important40 because they positively correlate with the severity of tic symptoms.41–43 Future studies should pay more attention to the association between PUs and the development of tic symptoms.

Further studies are also required to investigate how these ‘important risk factors’ influence the development of tic symptoms. Techniques such as structural or functional MRI might provide new insight into the underlying mechanisms because TD is a neurodevelopmental disease with structural abnormalities in the supplementary motor area, somatosensory cortex, premotor cortex, limbic system and basal ganglia, and with abnormal network connectivity of the CSTC.10 11 Therefore, these structural and functional brain abnormalities may serve as biomarkers in the diagnosis of different types of TD. Several studies have used SVM to differentiate patients with TS from healthy controls based on fMRI data, which indicates that SVM might be a suitable algorithm to analyse the rs-fMRI data and build the predictive model. To our knowledge, no study has explored the application of SVM in the classification of different subtypes of TD, and this study is the first to explore the accuracy of SVM in the early diagnostic classification of TD.

Although several studies have used SVM to differentiate patients with TS from healthy controls,20 21 our protocol deserves to be considered for the following reasons. First, this protocol describes the detailed procedures to build a predictive model by SVM, such as how to acquire and preprocess MRI data. Second, our study has a larger sample size than other SVM classification studies. Third, we outline the steps in constructing a functional network and SVM classification, which makes it easy for other research teams to replicate our procedure and validate our model in other data. Last, we will recruit only drug-naive children with PTD, which rules out medication confounders.

Two limitations exist in this study. First, tic symptoms might be suppressed when other people are present, and videotaping children alone in a room is a more objective way to observe tic expression.44 Second, only rs-fMRI data are included in the predictive model, and multimodal MRI data should be considered in future studies.


Thank you very much to all authors involved in this study.



  • YL and YC contributed equally.

  • Correction notice This article has been corrected since it first published. 'Ying Li' has been added as the corresponding author in the article.

  • Contributors For this manuscript, YC took the initiative. FWen, LY, JY and JL will participate in the data collection. YL and FWang finished the draft. All authors have read and approved the manuscript.

  • Funding This study is supported by the Special Fund of the Pediatric Medical Coordinated Development Center of Beijing Hospitals Authority (No. XTYB201802). YC was the founder.

  • Disclaimer The funding body had no further role in the study design, the collection, analysis, and interpretation of data, the writing of the manuscript or the decision to submit the paper for publication. We confirmed that our study protocol has undergone peer review by the funding body.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.