Cohort profile
Chronic non-communicable diseases: Hainan prospective cohort study
  1. Xingbo Gu1,
  2. Liuting Lin2,3,
  3. Chanjuan Zhao1,
  4. Ling Wu1,
  5. Yumei Liu2,
  6. Limin He2,
  7. Guotian Lin2,
  8. Yingzi Lin2,
  9. Fan Zhang2
  1. 1Department of Biostatistics, International School of Public Health and One Health, Hainan Medical University, Haikou, Hainan, China
  2. 2Laboratory of Tropical Environment and Health, International School of Public Health and One Health, Hainan Medical University, Haikou, Hainan, China
  3. 3Department of Science and Education, Hainan Cancer Hospital, Haikou, Hainan, China
  1. Correspondence to Professor Fan Zhang; zhangfan{at}; Yingzi Lin; linyingzi{at}


Purpose The Hainan Cohort was established to investigate the incidence, morbidity and mortality of non-communicable diseases and their risk factors in the community population.

Participants The baseline investigation of the Hainan Cohort study was initiated in five main areas of Hainan, China, from June 2018 to October 2020. A multistage cluster random-sampling method was used to obtain samples from the general population. Baseline assessments included a questionnaire survey, physical examination, blood and urine sample collection, and laboratory measurements, and outdoor environmental data were obtained.

Findings to date A total of 14 443 participants aged 35–74 years were recruited at baseline, with a participation rate of 90.1%. The mean age of the participants was 48.8 years; 51.8% were men, and 83.7% had a secondary school or higher education. The crude prevalence rates of diabetes, coronary heart disease, stroke, hypertension, hyperuricaemia, chronic bronchitis, pulmonary tuberculosis, asthma, cancer, chronic hepatitis and metabolic syndrome were 8.6%, 9.2%, 2.0%, 37.1%, 7.1%, 2.3%, 1.4%, 2.1%, 4.1%, 2.2% and 14.5%, respectively.

Future plans The Hainan Cohort is a dynamic cohort with no end date. All participants will be monitored annually for cause-specific mortality and morbidity until death. Long-term follow-up will be conducted every 5 years. Depending on the availability of funding, the baseline population may be expanded in the next wave of follow-up.


Data availability statement

No data are available.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • The Hainan Cohort study is the first independent biospecimen-based prospective study on non-communicable diseases (NCDs) in tropical China.

  • The sample size was relatively large, which gives us sufficient power to explore the relationship between risk factors and common chronic NCDs.

  • The cohort database is linked to the medical insurance system, the mortality surveillance system and the national basic public health service platform, and these data will be combined with those obtained from long-term follow-up surveys to ensure accurate endpoint information.

  • The baseline survey did not collect the data of eligible people who did not participate in the survey, which may lead to selection bias.

  • More than 98% of the participants in the cohort were Han Chinese, so caution is needed when applying the results to other populations.


Chronic non-communicable diseases (NCDs), such as cardiovascular diseases, cancer, diabetes and chronic respiratory diseases, are the most important public health problems globally.1–3 The main characteristics of NCDs are that they are common, long-lasting, and cannot be or are rarely cured completely.4 According to the WHO, NCDs are the leading cause of death worldwide, accounting for 71% of all deaths annually.5 In recent years, with the rapid development of China’s economy and continuous improvements in people’s living standards, traditional living environments, diets and lifestyles have changed, leading to significant changes in the main risk factors for NCDs in the Chinese population.6 Although substantial progress in reducing the burden of NCDs in China has been made, the prevalence of NCDs continues to increase; moreover, the burdens imposed by NCD subcategories have variably persisted in different provinces of China.7–9

Hainan is a tropical island in the southernmost part of China and is China’s second largest island after Taiwan.10 The unique geographical location of Hainan means the climate characteristics are different from those of inland areas in China.11 As China’s only tropical province, most of the island has a humid tropical monsoon climate characterised by hot and humid summers with mild, pleasant winters.12 The average temperature in Hainan usually ranges from 24°C to 35°C in the summer and from 19°C to 25°C in the winter.13 Such a climate could contribute to the proliferation and spread of micro-organisms, such as Epstein-Barr virus.14 Most Hainan residents consume a light diet, with fewer oils, a sweeter taste and less seasoning than diets on the Chinese mainland.15 Seafood, coffee and tea are the most common items in the Hainanese diet.16 A previous study reported that the prevalence of NCDs in Haikou (capital city of Hainan) was significantly different from that in other parts of mainland China.17 This may be caused by different environmental exposures or lifestyles between Hainan residents and the mainland Chinese population. Therefore, analysing the influence of Hainan’s climate, dietary habits and lifestyle on NCDs would provide important guidance for protecting the health of Hainan residents. In addition, inherited factors may play a critical role in the progression of NCDs in Hainan residents. Although genome-wide association studies have led to successful identification of a large number of common gene variants associated with NCDs,18–22 these findings may not be appropriate for the Hainan population, whose physical characteristics are different from those of most of the mainland Chinese population. Furthermore, studies that provide clear explanations of the effects of potential interactions between genetic and unique environmental risk factors on NCDs are scarce.23 24 To date, there is no independent biospecimen-based prospective study on NCDs in Hainan. Aetiological descriptions and prevention and control measures for NCDs in mainland Chinese and Western populations cannot be generalised to the Hainan population. There is an urgent need to establish a large, localised population-based cohort; such a cohort would not only have practical significance for the health of people in Hainan but also provide a scientific reference for neighbouring island countries and regions, such as those in Southeast Asia.

From June 2018 to November 2020, we conducted a large population-based prospective cohort study in Hainan, with an appropriate study design and rigorous data quality control. The main study objectives of the Hainan Cohort were to (1) establish a large population-based cohort database and biobank; (2) explore the prevalence and incidence of and risk factors for common NCDs; (3) clarify the relationships among environmental factors, lifestyle, psychosocial factors and NCDs, and potential effect modifications among these factors; (4) assess the effects of genetic factors and gene–environment interactions on NCDs; (5) identify potential biomarkers of NCDs using metabolomic analysis; (6) establish a tracking system to monitor the occurrence, progression and prognosis of common NCDs, and all-cause death; and (7) develop and validate a health risk assessment and risk prediction model for NCDs.

Cohort description

Selection of study site and participant eligibility

The Hainan Cohort study was designed as a longitudinal study that recruited community dwellers in five main areas of Hainan (figure 1): Haikou (the capital city located on the northern coast of Hainan), Sanya (central city in the south of Hainan), Qionghai (a city located in the east of Hainan), Dongfang (a city located on the western coast of Hainan) and Qiongzhong (Li and Miao Autonomous County in central Hainan). These areas cover both urban and rural locations, as well as different ethnic groups who speak different dialects, and are representative of the diverse sociodemographic characteristics of Hainan. We determined the sample size of each area based on the population distribution of Hainan and other objective factors, such as biological sample preservation, transportation and local coordination. Then, a multistage cluster random-sampling method was used to obtain samples from the general population. In the first stage, 18 districts were selected from the five major areas through simple cluster sampling. In the second stage, communities or units in each district were randomly selected considering population stability, local medical conditions and adherence to future follow-up assessments. The inclusion criteria were as follows: (1) age between 35 and 74 years; (2) residence in Hainan for at least 5 years; (3) no severe physical disability, mental illness or severe medical condition; and (4) ability to complete questionnaire interviews, physical examinations and follow-up assessments. All the participants enrolled in this study voluntarily and signed an informed consent form on enrolment. A unique identifier was created for each participant using their national identification card.
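The two sampling stages described above can be illustrated with a minimal Python sketch. It is not the study's actual procedure: the sampling frame, the district and community names, and the function `multistage_cluster_sample` are all hypothetical, and the sketch draws communities purely at random, whereas the study also weighed population stability, local medical conditions and likely follow-up adherence.

```python
import random


def multistage_cluster_sample(areas, n_districts, n_communities_per_district, seed=0):
    """Illustrative two-stage cluster sample (hypothetical helper).

    areas maps area name -> {district name -> list of community names}.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Stage 1: pool every (area, district) pair and draw districts at random.
    all_districts = [
        (area, district)
        for area, districts in areas.items()
        for district in districts
    ]
    chosen_districts = rng.sample(all_districts, n_districts)

    # Stage 2: within each chosen district, draw communities at random.
    selected = {}
    for area, district in chosen_districts:
        communities = areas[area][district]
        k = min(n_communities_per_district, len(communities))
        selected[(area, district)] = rng.sample(communities, k)
    return selected


# Hypothetical sampling frame; names are placeholders, not real study units.
frame = {
    "Haikou": {"district_A": ["c1", "c2", "c3"], "district_B": ["c4", "c5"]},
    "Sanya": {"district_C": ["c6", "c7", "c8"]},
}
selected = multistage_cluster_sample(frame, n_districts=2, n_communities_per_district=2)
```

Every eligible resident of a selected community would then be invited, which is what makes the design a cluster sample rather than an individual-level random sample.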

Figure 1

Locations of the five survey sites in the Hainan Cohort.

Patient and public involvement

Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.

Study plan

The Hainan Cohort is a dynamic cohort with no end date, and we will collect follow-up data for each participant periodically until death. Follow-up will be conducted using two methods: monitoring and long-term follow-up surveys. For basic monitoring of the participants, we are cooperating with the Social Insurance Service Centre and Healthcare Security Administration of Hainan. Using the independent medical insurance number of each urban and rural participant who signed the informed consent form, we will collect information about medical treatment, hospitalisation and disease incidence from the department of medical insurance data management in January and July of each year. The Mortality Surveillance System of the Hainan Centre for Disease Control and Prevention allows us to track cause-specific mortality in all the participants in the cohort. Causes of death will be classified according to the 10th revision of the International Statistical Classification of Diseases and Related Health Problems. The National Basic Public Health Service Platform has achieved full coverage in Hainan, enabling us to ascertain the onset time of common chronic diseases, such as hypertension and diabetes, in the cohort.

Long-term follow-up will be conducted every 5 years. All surviving participants will be visited and will complete an in-person assessment identical to that conducted at baseline, comprising a questionnaire interview, physical examination, and blood and urine sample collection. The onset of endpoint events and the loss of cohort participants will be investigated and compared with the monitoring data to reduce possible omissions and misstatements caused by conventional monitoring. Endpoint events for major chronic diseases will be confirmed by experienced physicians based on well-accepted international standards. The household registration system of the public security department holds the relocation records of all permanent residents; combined with the national networked medical insurance system, it allows us to follow up study participants who can still be contacted after relocation and to keep the loss to follow-up rate below an estimated 8%. Figure 2 shows the overall study plan.

Figure 2

Study plan of the Hainan Cohort study. NCD, non-communicable disease.

Baseline assessments

In December 2017, a pilot study was carried out before the formal investigation. The aim of the pilot study was to evaluate whether each item on the questionnaire was easy to understand by persons in different age groups and from different cultural backgrounds and whether the options were logical. Additionally, the pilot study allowed us to find the most suitable method for asking difficult questions after discussion to facilitate the training of investigators in the future. The baseline assessments were performed from June 2018 to October 2020 and included a questionnaire survey, physical examination, blood and urine sample collection, laboratory measurements and outdoor environmental data collection.

After overnight fasting, all participants were required to bring their ID cards or household registration books to the local health examination centre at the specified time. After written informed consent was obtained, a validated face-to-face questionnaire interview was conducted by well-trained interviewers from the School of Public Health, Hainan Medical University. The questionnaire collected general demographic (name, sex, age, ethnicity, education level, marital status, occupation and annual household income), medical history (diabetes mellitus, coronary heart disease, stroke, hypertension, hyperuricaemia, cancer, etc), family disease history (coronary heart disease, hypertension, stroke, diabetes mellitus, cancer and chronic obstructive pulmonary disease), lifestyle habit (smoking, alcohol consumption, tea consumption, dietary pattern, sleeping status, physical activity, passive smoking and indoor pollution), female reproductive history and mental status data. Table 1 shows detailed information on the baseline measures.

Table 1

Summary of baseline measures in the Hainan Cohort study

Physical examinations were conducted by trained doctors or nurses from the First Affiliated Hospital of Hainan Medical University; collected measurements included height, waist circumference, hip circumference, blood pressure, heart rate, lung function, body composition (weight, body mass index (BMI), body fat percentage, skeletal muscle mass, bone mass, daily energy requirement, etc), as well as 12-lead electrocardiography. All measurements were collected by trained investigators in a large, bright room with a suitable temperature using standard procedures. Height was measured without shoes and hats using an ultrasonic height measuring instrument (SK-X80, China). Waist and hip circumferences were measured while the participants were wearing thin clothing. The investigator stood on the right side of the participant and measured them with a tape measure (Hoechstmass, China). The above three measurements were accurate to the nearest 0.1 cm. Blood pressure and heart rate were measured twice consecutively using a validated electronic sphygmomanometer (HEM-7071; OMRON, Japan) on the right arm after the participants had rested for at least 5 min in a sitting position. Forced expiratory volume in 1 s and forced vital capacity were measured using a spirometer (SP10; Contec, China); the participants took a deep breath and blew at the fastest speed and with maximum strength until all the gas in their lungs was exhaled. Body composition was measured using a body fat detector (BC-6017611; Tanita, Japan). Participants removed their coats, shoes and socks and checked that the soles of their feet were clean; when necessary, they used a wet towel to wipe their feet, and they removed all metal objects (such as mobile phones and keys) to avoid affecting the result. Weight was measured to the nearest 0.1 kg. Electrocardiography was performed (ECG-1350C; NEC Lighting, China), and an experienced physician reviewed each ECG. BMI was calculated as weight (kg) divided by height (m) squared. Table 2 shows the details of the equipment for the physical examination and clinical measurements.
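As a worked illustration of the BMI formula, the following Python sketch converts a height recorded in centimetres (as in the physical examination) to metres before applying weight divided by height squared. The function name `body_mass_index` and the example values are hypothetical.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI = weight (kg) / height (m)^2; height is supplied in cm and converted."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


# Hypothetical participant: 68.0 kg and 170.0 cm.
example_bmi = round(body_mass_index(68.0, 170.0), 1)
print(example_bmi)  # 23.5
```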

Table 2

Summary of anthropometric and clinical measurements collected at baseline in the Hainan Cohort study

Before blood collection, all participants fasted for at least 10 hours as required. To prevent the occurrence of hypoglycaemic events, all blood samples were collected between 07:00 and 10:00. Three tubes of peripheral venous blood, totalling 15 mL, were drawn from each participant; two tubes were EDTA anticoagulation tubes for the separation of plasma and blood cells, and one tube was a non-anticoagulation tube for the separation of serum and clotted blood. The collected blood samples were sent to the laboratory for testing within 2 hours. The serum samples were used to detect biochemical indicators with an automatic biochemical analyser (7600–110; Hitachi, Japan), including alanine aminotransferase, aspartate aminotransferase, albumin, alkaline phosphatase, gamma-glutamyl transpeptidase, triglycerides (TGs), total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), urea, creatinine, uric acid (UA), fasting plasma glucose (FPG) and glycated haemoglobin. Routine blood parameters, including white blood cell count, red blood cell count, red cell distribution width, platelet count, etc, were measured using an automatic blood cell analyser (BC-6900; Mindray, China). The remaining blood samples were stored in a 4°C refrigerator and centrifuged within 1 day after collection. Plasma, serum, and white and red cells were collected after 15 min of centrifugation at 3000 rpm. Aliquots of 0.2 mL of centrifuged blood samples were retained in each tube and stored in a −80°C refrigerator to prevent DNA breakage or degradation. At least 50 mL of fasting midstream morning urine was collected using a urine cup and immediately placed into a refrigerator at −4°C after collection. After testing, the remaining urine samples were divided into two 15 mL centrifuge tubes using disposable plastic straws (10–15 mL each) and sent back to the laboratory for preservation within 24 hours. The routine urine test indicators included urine glucose, urine protein, urobilirubin, etc. All these test items were measured by the general standardised methods of the Clinical Laboratory of the Affiliated Hospital of Hainan Medical University.

Outdoor environmental data, including daily maximum temperature, daily minimum temperature and average temperature, were collected from the National Meteorological Bureau and local meteorological departments. Air quality monitoring data at the survey site, including the daily air quality index, air quality levels and primary pollutant (SO2, NO2, CO, O3, PM10 and PM2.5) concentrations, were obtained through the air quality platform of the National and Hainan Environmental Protection Departments.

Hypertension was defined as a sitting systolic blood pressure (average of 2 readings with a difference of <5 mm Hg) of ≥140 mm Hg or diastolic blood pressure of ≥90 mm Hg and/or the use of antihypertensive drugs.25 Diabetes mellitus was defined as a fasting blood glucose value of ≥7.0 mmol/L, previously diagnosed diabetes mellitus or the use of antidiabetic agents.26 Metabolic syndrome (MS) was defined according to the criteria of the International Diabetes Federation (IDF) and the Chinese-specific abdominal obesity standard in adults (waist circumference of ≥90 cm in men and ≥80 cm in women), along with the presence of two or more of the following: (1) TG level of >150 mg/dL (1.7 mmol/L) or drug treatment for elevated TGs; (2) HDL level of <40 mg/dL (0.9 mmol/L) in men or <50 mg/dL (1.1 mmol/L) in women, or drug treatment for low HDL; (3) systolic blood pressure of ≥130 mm Hg or diastolic blood pressure of ≥85 mm Hg or drug treatment for hypertension; and (4) FPG of ≥100 mg/dL (5.6 mmol/L) or diagnosed diabetes.27
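The case definitions above can be expressed as simple rules. The Python sketch below is illustrative only and is not the study's analysis code: the `Participant` record and its field names are hypothetical, the mmol/L cut-offs are copied from the text as stated, and the hypertension rule assumes the conventional "at or above" reading of the 140/90 mm Hg thresholds applied to the mean of two readings.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Tuple


@dataclass
class Participant:
    """Hypothetical record holding only the fields the definitions need."""
    sex: str                           # "M" or "F"
    sbp_readings: Tuple[float, float]  # two systolic readings, mm Hg
    dbp_readings: Tuple[float, float]  # two diastolic readings, mm Hg
    fpg_mmol_l: float                  # fasting plasma glucose
    tg_mmol_l: float                   # triglycerides
    hdl_mmol_l: float                  # HDL cholesterol
    waist_cm: float
    on_antihypertensives: bool = False
    diagnosed_diabetes: bool = False
    on_antidiabetics: bool = False
    on_tg_treatment: bool = False
    on_hdl_treatment: bool = False


def has_hypertension(p: Participant) -> bool:
    # Mean of the two readings at or above 140/90 mm Hg, or drug treatment.
    return (mean(p.sbp_readings) >= 140
            or mean(p.dbp_readings) >= 90
            or p.on_antihypertensives)


def has_diabetes(p: Participant) -> bool:
    return p.fpg_mmol_l >= 7.0 or p.diagnosed_diabetes or p.on_antidiabetics


def has_metabolic_syndrome(p: Participant) -> bool:
    # Abdominal obesity is required; cut-offs depend on sex.
    waist_cutoff = 90 if p.sex == "M" else 80
    if p.waist_cm < waist_cutoff:
        return False
    hdl_cutoff = 0.9 if p.sex == "M" else 1.1
    components = [
        p.tg_mmol_l > 1.7 or p.on_tg_treatment,
        p.hdl_mmol_l < hdl_cutoff or p.on_hdl_treatment,
        (mean(p.sbp_readings) >= 130 or mean(p.dbp_readings) >= 85
         or p.on_antihypertensives),
        p.fpg_mmol_l >= 5.6 or p.diagnosed_diabetes,
    ]
    # At least two of the four additional components must be present.
    return sum(components) >= 2


# Hypothetical example values.
example = Participant(sex="M", sbp_readings=(142, 144), dbp_readings=(88, 86),
                      fpg_mmol_l=5.9, tg_mmol_l=2.0, hdl_mmol_l=1.2, waist_cm=92)
```

For this example participant, the mean systolic pressure of 143 mm Hg meets the hypertension rule, the fasting glucose of 5.9 mmol/L does not meet the diabetes rule, and abdominal obesity plus three of the four components meets the metabolic syndrome rule.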

In addition, a resurvey was conducted in a random subsample of the Hainan Cohort during the time of the baseline survey. Approximately 5% of participants were randomly chosen and completed the resurvey. The procedures and content of the resurvey were consistent with those of the baseline survey.

Findings to date

Of the 16 070 eligible people, 14 602 participants were enrolled and underwent at least one major measurement at baseline, with a participation rate of 90.1%. The most common reasons for declining participation in the study were no interest in or lack of understanding of the study, no time to participate in the examinations and fear of having blood drawn. Of those who completed the survey, 159 were excluded because they were younger than 35 or older than 74 years at baseline. Finally, a total of 14 443 participants fulfilled the inclusion criteria of this study. The baseline characteristics of the participants are presented in table 3. Among the participants, 51.8% (n=7484) were men; the mean age was 48.8 (SD 9.6) years; and less than 15% of people were over the age of 60. A total of 83.7% of the participants had a secondary school education or above, and the illiteracy rate was only 5.2%. The proportion of married participants was 89.7% for both sexes. Divorced, widowed and single participants accounted for 3.3%, 3.5% and 3.6%, respectively. The most commonly reported annual household income level was between ¥30 000 and ¥60 000 (33.3%). Among men, 45.8% were current smokers, and 7.8% were former smokers. Among women, 98.9% of participants had never smoked. Nevertheless, the percentage of women exposed to passive smoking was higher than that of men (29.1% vs 16.0%). Regarding alcohol consumption, 30.7% of men and 6.4% of women were current alcohol consumers, and 64.9% of men and 90.5% of women had never consumed alcohol. More than half of the participants consumed soup (41.4% consumed it sometimes, 14.9% consumed it every day). A total of 5.3% of the participants never ate breakfast, and 3.7% of participants ate midnight snacks every day. Regarding regular exercise, the prevalence in men was slightly higher than that in women (29.1% of men and 26.6% of women reported engaging in regular exercise). The mean sleep duration, BMI, waist and hip circumferences, systolic blood pressure and diastolic blood pressure, heart rate, body fat rate, skeletal muscle and bone mass, body moisture rate and visceral fat index were 7.2 hours, 23.7 kg/m2, 82.1 and 94.1 cm, 130.2 and 82.1 mm Hg, 79.7 beats/min, 26.8%, 43.8 and 2.7 kg, 54.9% and 8.4, respectively.

Table 3

Baseline characteristics of the participants in the Hainan Cohort study

Table 4 shows the biochemical characteristics of the participants at baseline. The mean TG, TC, HDL-C, LDL-C, FPG, creatinine and UA levels were 1.6 mmol/L, 5.5 mmol/L, 1.9 mmol/L, 5.7 mmol/L, 69.4 µmol/L and 325.8 µmol/L, respectively. Information about other hepatic function and routine blood results can be found in table 4.

Table 4

Baseline levels of biochemical traits of the participants in the Hainan Cohort study

The baseline prevalence of common chronic NCDs is presented in figure 3. The prevalence rates of diabetes, coronary heart disease, stroke, hypertension, hyperuricaemia, chronic bronchitis, pulmonary tuberculosis, asthma, cancer, chronic hepatitis and MS were 8.6%, 9.2%, 2.0%, 37.1%, 7.1%, 2.3%, 1.4%, 2.1%, 4.1%, 2.2% and 14.5%, respectively. Men had higher rates of diabetes, coronary heart disease, hypertension, hyperuricaemia, chronic bronchitis, pulmonary tuberculosis, asthma, chronic hepatitis and MS than women, while women had higher rates of stroke and cancer.

Figure 3

Prevalence of common chronic NCDs at baseline in the Hainan Cohort study. (A) Total participants, (B) stratified by sex. CHD, coronary heart disease; NCD, non-communicable disease.

A total of 637 (4.4%) participants completed the resurvey. Table 5 shows the correlation of some of the variables between the baseline survey and the resurvey. For most of the selected measurements, the resurvey was in good concordance with the baseline survey. Among these measurements, the correlation coefficient of routine blood indexes was the highest, at approximately 0.95. The correlation coefficients for the physical examination and blood biochemical examination results ranged between approximately 0.6 and 0.9.

Table 5

Spearman correlation coefficients for selected measurements between the baseline survey and resurvey among 637 participants
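For readers unfamiliar with the statistic, the following Python sketch shows one way a Spearman coefficient between paired baseline and resurvey values can be computed: rank each series (tied values share their average rank), then take the Pearson correlation of the ranks. It uses only the standard library; the function names and the example readings are hypothetical and are not data from the cohort.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical paired systolic readings (mm Hg) for five participants.
baseline = [118, 135, 142, 126, 150]
resurvey = [121, 138, 133, 124, 149]
print(round(spearman_rho(baseline, resurvey), 2))  # 0.9
```

Because the coefficient depends only on ranks, it reflects whether participants keep the same relative ordering between surveys rather than whether the absolute values agree.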

Strengths and limitations

This cohort study has several clear strengths. First, as far as we know, the Hainan Cohort study is the first independent biospecimen-based prospective study on NCDs in tropical China to analyse detailed information including sociodemographic characteristics, lifestyle habits, personal and family histories of diseases, dietary pattern, sleeping status, physical activity levels, female reproductive history, and anthropometric and clinical measurements. Second, the sample size was relatively large, which gives us sufficient power to explore the relationship between risk factors and common chronic NCDs, such as diabetes mellitus, chronic obstructive pulmonary disease, myocardial infarction and stroke (see sample size estimation in online supplemental materials 1). Third, the biospecimens collected at baseline will allow us to conduct nested case–control studies on genetic variations and metabolomic biomarkers for NCDs. Fourth, owing to rigorous data quality control, the age distribution and sex ratio of the baseline data are broadly representative of those in the Chinese population (based on the seventh Chinese National Census). Finally, we linked the cohort database with the medical insurance system, the mortality surveillance system and the national basic public health service platform and will combine these data with those obtained from long-term follow-up surveys to ensure that we obtain accurate endpoint information.

This study also has limitations. First, we did not collect the basic data of eligible people who did not participate in the survey according to the sampling plan, which may lead to selection bias. Second, for some NCDs with a low incidence rate (such as lung cancer), the sample size of the study is insufficient. We will consider expanding the number of participants in the next wave of follow-up. Third, although we carefully checked the collected questionnaires and the resurvey showed good concordance with the baseline survey, we could not rule out possible recall and measurement bias during data collection. Finally, more than 98% of the participants in the cohort were Han Chinese; thus, caution should be taken when generalising the results of this work to other ethnicities.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the human research ethics committee of Hainan Medical University on 21 February 2017 (IRB number 2017-04). The participants gave informed consent to participate in the study before taking part.


We thank Sun Yat-sen University for its strong support for this project and all the research participants and staff of Hainan Key Novel Thinktank "Hainan Medical University 'One Health' Research Center", the First Affiliated Hospital of Hainan Medical University and the Second Affiliated Hospital of Hainan Medical University. We thank all participants for their contributions.



  • XG and LL contributed equally.

  • Collaborators The study data are currently not freely available; however, the research team welcomes all potential collaborations with other researchers. For further information, please email the corresponding authors (FZ or YZL).

  • Contributors XG and LL conceptualised the study and contributed to the drafting of the manuscript and are cofirst authors. YL and FZ contributed to the study design and manuscript revision and are corresponding authors. CZ, LW, YL, LH and GL contributed to the data collection and data analysis and are coauthors. FZ is responsible for the overall content as guarantor.

  • Funding This work was supported by grants from the National Key R&D Program of China (number 2017YFC0907104) and the National Natural Science Foundation of China (number 81860577).

  • Map disclaimer The inclusion of any map (including the depiction of any boundaries therein) or of any geographical or locational reference does not imply the expression of any opinion whatsoever on the part of BMJ concerning the legal status of any country, territory, jurisdiction or area or of its authorities. Any such expression remains solely that of the relevant source and is not endorsed by BMJ. Maps are provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.