Serum levels of chemical elements in esophageal squamous cell carcinoma in Anyang, China: a case-control study based on machine learning methods
  1. Tong Lin1,
  2. Tiebing Liu2,
  3. Yucheng Lin1,
  4. Chaoting Zhang3,
  5. Lailai Yan4,
  6. Zhongxue Chen5,
  7. Zhonghu He3,
  8. Jingyu Wang4
  1. 1The Key Laboratory of Machine Perception (Ministry of Education), School of EECS, Peking University, Beijing, China
  2. 2Civil Aviation Medicine Center, Civil Aviation Administration of China, Beijing, China
  3. 3Key laboratory of Carcinogenesis and Translational Research (Ministry of Education), Laboratory of Genetics, Peking University Cancer Hospital & Institute, Beijing, China
  4. 4Center of Medical & Health Analysis, School of Public Health, Peking University, Beijing, China
  5. 5Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, Bloomington, Indiana, USA
  1. Correspondence to Zhonghu He; zhonghuhe{at}foxmail.com and Dr Jingyu Wang; wjy{at}bjmu.edu.cn

Abstract

Objectives Esophageal squamous cell carcinoma (ESCC) is the predominant form of esophageal carcinoma with extremely aggressive nature and low survival rate. The risk factors for ESCC in the high-incidence areas of China remain unclear. We used machine learning methods to investigate whether there was an association between the alterations of serum levels of certain chemical elements and ESCC.

Settings Primary healthcare unit in Anyang city, Henan Province of China.

Participants 100 patients with ESCC and 100 healthy controls matched for age, sex and region were included.

Primary and secondary outcome measures Primary outcome was the classification accuracy. Secondary outcome was the p value of the t-test or rank-sum test.

Methods Both traditional statistical methods of t-test and rank-sum test and fashionable machine learning approaches were employed.

Results Random Forest achieves the best accuracy of 98.38% on the original feature vectors (without dimensionality reduction), and support vector machine outperforms other classifiers by yielding accuracy of 96.56% on embedding spaces (with dimensionality reduction). All six classifiers can achieve accuracies more than 90% based on the single most important element Sr. The other two elements with distinctive difference are S and P, providing accuracies around 80%. More than half of chemical elements were found to be significantly different between patients with ESCC and the controls.

Conclusions These results suggest clear differences between patients with ESCC and controls, implying some potential promising applications in diagnosis, prognosis, pharmacy and nutrition of ESCC. However, the results should be interpreted with caution due to the retrospective design nature, limited sample size and the lack of several potential confounding factors (including obesity, nutritional status, and fruit and vegetable consumption and potential regional carcinogen contacts).

  • Esophageal squamous cell carcinoma
  • chemical elements
  • machine learning

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

Strengths and limitations of this study

  • The classification accuracies achieved by machine learning methods are remarkably higher than most empirical decisions on this small corpus of 100 patients with esophageal squamous cell carcinoma and 100 healthy comparison subjects; in addition, the test error rates provide the quantitative confidence of the prediction results.

  • This diagnosis procedure is not expensive and can be conducted in a short period of time, making it possible for clinical use.

  • A major limitation of the present work is that the study is retrospective with a relatively small patient cohort. This framework should be evaluated with a larger patient cohort before any real clinical applications are adopted.

Introduction

Oesophageal cancer (EC) is an extremely aggressive cancer with a low survival rate, and it is one of the deadliest cancers worldwide. In 2012, an estimated 455 800 EC cases and 400 200 deaths occurred worldwide.1 2 Esophageal squamous cell carcinoma (ESCC) is the predominant form of esophageal carcinoma globally; most patients are diagnosed at an advanced stage and are not amenable to curative treatment. Major risk factors include poor nutritional status, low intake of fruits and vegetables, smoking, alcohol consumption, hot tea drinking, poor oral health, gastro-oesophageal reflux disease, overweight and obesity.3 4 However, the risk factors for ESCC in high-incidence areas, such as north-central China, remain unclear.

In addition, most patients diagnosed with ESCC already have locally advanced EC or distant metastases because early signs or symptoms are lacking.5 Therefore, a diagnostic test that is practical, non-invasive and easy to perform is of great interest; new methods such as 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) have emerged for the initial staging of patients with EC.5 Currently, a large body of research aims to identify new biomarker candidates for cancers, such as prostate cancer,6 breast cancer,7 lung cancer8 and gastrointestinal neoplasia.9

Chemical elements play essential roles in biological processes. A number of studies have shown that changes in the levels of chemical elements might be linked to the risk of some cancers,10 11 including EC.12 However, very few relevant studies have been conducted, and they examined only Se, Cu and Zn.12–14 Many other chemical elements, such as Mo, Ni, PR, Rb, Sb, Sn, Sr, Th, Ti, Tl, U and V, have not been investigated. The underlying interactions among these chemical elements can be complex, so traditional single-variable or correlation analysis may be unable to yield accurate predictions. Recently, machine learning techniques, such as support vector machines (SVM) and feature selection methods, have been gaining popularity in this field because they handle high-dimensional input features and yield better diagnostic accuracies.15

In this study, based on recent machine learning techniques and classical statistical methods, a 1:1 matched case–control study was conducted to probe the differences in the serum levels of 38 relatively common chemical elements between patients with ESCC and healthy comparison subjects.

Materials and methods

In the following subsections, we describe the ESCC serum sample acquisition and preprocessing, the evaluation protocol, the main ideas behind the dimensionality reduction and classification algorithms, and statistical hypothesis testing. The design and analysis of this study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.16

Sample collection

At the Cancer Hospital of Anyang city, Henan Province of China, 100 patients newly diagnosed with early-stage ESCC were consecutively recruited in 2010. During the same period, 100 age-, sex- and region-matched healthy comparison subjects were randomly selected from a cohort study17 of ESCC conducted in Anyang city. Demographic data, personal information and blood samples were obtained from both groups. For patients with ESCC, only samples collected at least 1 week before esophagectomy were considered. Patients who had ESCC together with other cancers were excluded from the study. Each blood sample was centrifuged at 3000 rpm for 15 min, and the serum was separated and stored at −20°C. This study was approved by the Institutional Review Board of Peking University School of Oncology, Beijing. Informed consent was obtained from all participants.

In this study, we defined regular cigarette smoking as a history of smoking at least one cigarette per day for 12 months or 18 packs for 1 year, and regular alcohol consumption was defined as drinking Chinese liquor at least twice per week for 12 months (regular consumption of other beverages such as beer or red wine is very rare in this local area).

Elements measurement

Each serum sample was put into a quartz tube and 0.3 mL purified HNO3 (nitric acid) was added. After predigestion at room temperature for 2 hours, 0.5 mL H2O2 was added to promote further digestion. The tubes were then placed in a microwave digestion system (Ultrawave, Milestone, Italy) and diluted to 7 mL with deionised water and then diluted to 15 mL with deionised water before analyses. Concentrations of calcium (Ca), magnesium (Mg), potassium (K), phosphorus (P) and sodium (Na) were determined by inductively coupled plasma-atomic emission spectrometry (ICP-AES, American Thermo Electron Corporation iCAP-6300). The levels of the other 33 elements, including iron (Fe), selenium (Se), copper (Cu), zinc (Zn), aluminium (Al), manganese (Mn), arsenic (As), molybdenum (Mo), vanadium (V), chromium (Cr), nickel (Ni), lead (Pb), cadmium (Cd), beryllium (Be), boron (B), titanium (Ti), germanium (Ge), strontium (Sr), lithium (Li), silver (Ag), cadmium (Cd), tin (Sn), barium (Ba), platinum (Pt), thallium (Tl), bismuth (Bi), caesium (Cs), thorium (Th), uranium (U), lanthanum (La), cerium (Ce), rubidium (Rb) and mercury (Hg), were measured by inductively coupled plasma mass spectrometry (ICP-MS, American PerkinElmer ELAN DRC Ⅱ). Notably, the concentrations of six elements are very low (possibly below the detection limit of the spectrometer): Be, Cd, Pt, U, V and Hg. However, we did not remove these six ‘nuisance’ low-concentration elements; instead, they were retained to serve as noise for testing the robustness of our algorithm.18 19

Quality control

The following measures were taken to ensure the accuracy of the measured levels of macro and trace elements. (1) All reagents were analytical grade, and water was deionised. (2) All tubes were washed with HNO3 and rinsed with deionised water. (3) Indium was added to each sample as an internal standard before digestion. (4) We replicated all the blood samples and used several Standard Plasma References, Level I (REF 8883) and Level II (REF 8884), for quality control. (5) All tubes used were made of polypropylene instead of glass to prevent metal contamination. (6) The measurement of element levels was based on the most abundant isotope of each element to avoid interference.

Data normalisation

Each sample contains five demographic characteristics (age, gender, smoking history, drinking history and family history of ESCC), together with the concentrations of the aforementioned 38 elements. After data acquisition, the data are preprocessed for later use. The first step is digitisation: gender (male or female), smoking history, drinking history and family history are each represented by 0 or 1. The next step is normalisation: all element concentrations are mapped into the interval (0, 1). Normalisation avoids numerical difficulties during calculations and prevents variables with greater numeric ranges from dominating variables with smaller numeric ranges, which is important for practical deployments of neural networks (NN), SVMs and other classifiers.20 In the same way, ages are linearly transformed from the range of 0 to 100 years into (0, 1). After preprocessing, the data from the 100 patients with ESCC and the 100 healthy controls were summarised as a 200×43 matrix, with one row per subject. Ground-truth labels of +1 and −1 denote ESCC cases and controls, respectively.
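
The preprocessing steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the text does not give the exact mapping used for the concentrations, so min-max scaling is assumed here, and the function names and sample values are hypothetical.

```python
def min_max_normalise(column):
    """Linearly map a list of numbers onto [0, 1] (minimum -> 0, maximum -> 1).

    Assumption: min-max scaling; the article does not state the exact mapping.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map every entry to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def normalise_age(age_years, max_age=100.0):
    """Map an age from the fixed range 0..max_age years onto 0..1."""
    return age_years / max_age


def encode_binary(value, positive):
    """Digitise a two-valued answer as 1 (equals `positive`) or 0."""
    return 1 if value == positive else 0


# Hypothetical concentrations of one element (arbitrary units, not study data).
sr = [20.0, 35.0, 50.0, 80.0]
sr_scaled = min_max_normalise(sr)
```

Because every feature ends up on a comparable scale, distance-based or margin-based classifiers such as SVM are not dominated by the elements with the largest raw concentrations.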

Evaluation protocol

To compare the classification performance of different methods, we use 10 rounds of fivefold cross validation to obtain the average classification accuracy,21 namely the proportion of test subjects that are diagnosed correctly. Specifically, the classification accuracy is the ratio of the number of correct decisions (true positives and true negatives for binary problems) to the total number of test subjects. In each round of fivefold cross validation, the labelled examples are randomly partitioned into five chunks (also called folds) of approximately equal size, and a classification model trained on any four chunks yields a test accuracy on the remaining chunk. For 10 rounds of fivefold cross validation, the test accuracy of each method is computed as the average accuracy over 50 (=10×5) chunks. The means and SDs are reported.
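
The protocol can be illustrated with a short Python sketch. It is not the authors' Matlab implementation: the function names, the toy threshold classifier and the synthetic data are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def repeated_kfold_accuracy(X, y, fit_predict, rounds=10, k=5, seed=0):
    """Return (mean, SD) of the test accuracy over rounds * k held-out folds.

    fit_predict(train_X, train_y, test_X) must return the predicted labels.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(rounds):
        indices = list(range(len(y)))
        rng.shuffle(indices)                       # fresh random partition per round
        folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
        for held_out in folds:
            held = set(held_out)
            train = [i for i in indices if i not in held]
            predictions = fit_predict([X[i] for i in train],
                                      [y[i] for i in train],
                                      [X[i] for i in held_out])
            correct = sum(p == y[i] for p, i in zip(predictions, held_out))
            accuracies.append(correct / len(held_out))
    return mean(accuracies), stdev(accuracies)


def threshold_classifier(train_X, train_y, test_X):
    """Toy stand-in classifier: threshold feature 0 midway between class means."""
    pos = [x[0] for x, label in zip(train_X, train_y) if label == +1]
    neg = [x[0] for x, label in zip(train_X, train_y) if label == -1]
    cut = (mean(pos) + mean(neg)) / 2
    sign = 1 if mean(pos) >= mean(neg) else -1
    return [+1 if sign * (x[0] - cut) > 0 else -1 for x in test_X]


# Synthetic, perfectly separable one-feature data (NOT the study data).
X = [[0.1 + 0.01 * i] for i in range(20)] + [[0.7 + 0.01 * i] for i in range(20)]
y = [-1] * 20 + [+1] * 20
avg, sd = repeated_kfold_accuracy(X, y, threshold_classifier)
```

Any of the six classifiers could be passed in place of `threshold_classifier`, provided it is wrapped in the same three-argument interface.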

For two-class problems, a detailed report of classification results is the confusion matrix consisting of four numbers: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). Based on the confusion matrix, one can compute different measures to summarise the results, such as recall=TP/(TP+FN) and precision=TP/(TP+FP). Since our case–control study has an artificial disease prevalence and an unrepresentative disease spectrum, our estimates of precision and recall are not applicable outside of this study. The true disease prevalence should be considered in estimates of recall and precision for a clinical setting. If there is a changeable parameter (threshold) that influences the final decision, one can obtain a visual analysis by computing the receiver operating characteristic (ROC) curve (a sequence of pairs of FP rate and TP rate) while varying the parameter value. Another popular measure is the area under the curve, which reduces the ROC curve to a single number,22 but this measure requires classifiers to change their parameters continuously to yield a function of sensitivity and specificity, which is rather demanding for our developed methods. For a detailed description of different approaches to measuring classifier performance, one can refer to Section 19.7 of E. Alpaydin’s book.23 The running time of cross validation is used to compare the training and prediction time of each algorithm. All algorithms were implemented in Matlab on a 2.4 GHz i7-CPU machine with 8 GB memory.
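
As a small worked example of these definitions, the following Python sketch counts the four confusion-matrix entries and derives accuracy, recall and precision from them. The labels are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=+1):
    """Count TP, FP, TN and FN for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarise(tp, fp, tn, fn):
    """Accuracy, recall and precision as defined in the text."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Made-up labels: +1 for a case, -1 for a control.
y_true = [+1, +1, +1, -1, -1, -1, -1, +1]
y_pred = [+1, +1, -1, -1, -1, +1, -1, +1]
counts = confusion_counts(y_true, y_pred)   # (3, 1, 3, 1)
report = summarise(*counts)                 # all three measures equal 0.75
```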

Dimensionality reduction

The main objective of dimensionality reduction is to reduce the computational burden of classifiers and to alleviate the effects of data noise. In the standard paradigm of machine learning and pattern recognition, it is typical to include a stage of feature selection or transformation for dimensionality reduction before classification is performed, especially in applications with hundreds of input features (or variables). In our case, there are only 38 numerical variables measuring the concentrations of the selected chemical elements, but we still apply dimensionality reduction to see whether any redundancy can be removed and whether accuracy can be improved. Six methods are used for this objective: Fisher feature selection (FFS), principal component analysis, Fisher discriminant analysis (FDA) with its variant (FDAx, where the between-class scatter matrix is simply replaced by the ‘total’ mixture scatter matrix to allow more non-zero eigenvalues), locality preserving projection and factor analysis.24–28 Apart from FFS, in which a small set of features is chosen directly, the other five methods transform the original features into a low-dimensional embedding space. Note that the Random Forests (RF) method has a built-in mechanism of random feature selection, but it is of interest to see whether the new feature representations from dimensionality reduction can improve the accuracy of RF. The reader may refer to the review and systematic comparison of different methods for dimensionality reduction.29
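
To make the idea behind FFS concrete, here is a minimal Python sketch that scores each feature by a two-class Fisher ratio and keeps the top-scoring ones. The exact statistic used in the study is not spelled out in the text, so the common definition (squared difference of class means divided by the sum of within-class variances) is assumed; the feature names and values are synthetic.

```python
from statistics import mean, pvariance


def fisher_ratio(case_values, control_values):
    """Assumed two-class Fisher ratio: squared mean gap over summed variances."""
    gap = mean(case_values) - mean(control_values)
    within = pvariance(case_values) + pvariance(control_values)
    if within == 0:
        return float("inf") if gap != 0 else 0.0
    return gap * gap / within


def select_top_features(cases, controls, names, top=2):
    """Rank features by Fisher ratio; `cases` and `controls` are lists of rows."""
    scores = []
    for j, name in enumerate(names):
        f = fisher_ratio([row[j] for row in cases],
                         [row[j] for row in controls])
        scores.append((f, name))
    scores.sort(reverse=True)
    return [name for _, name in scores[:top]]


# Synthetic normalised values for three hypothetical features (not study data).
names = ["elem_a", "elem_b", "elem_c"]
cases = [[0.9, 0.5, 0.2], [0.8, 0.4, 0.3], [1.0, 0.6, 0.1]]
controls = [[0.1, 0.5, 0.3], [0.2, 0.6, 0.2], [0.0, 0.4, 0.4]]
chosen = select_top_features(cases, controls, names, top=2)
```

Unlike the projection-based methods, this selection keeps the chosen variables in their original units, which makes the result directly interpretable per element.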

Classification

A recent comprehensive performance evaluation on the whole collection of UCI classification datasets (available at http://archive.ics.uci.edu/ml) showed that the top-ranked classifiers include RF, SVM, NN and boosting. In our EC diagnosis system, we apply six kinds of classifiers. Besides RF,30 SVM,31 NN32 and AdaBoost (AB),33 two traditional classifiers, Naive Bayes (NB) and logistic regression (LR), are used for further comparison.34

Statistical hypothesis testing

To compare with machine learning approaches, traditional methods of statistical hypothesis testing are also employed. Through the following two hypothesis testing methods, we can determine whether the mean concentration of each element differs significantly between patients with EC and healthy comparison subjects. Student’s t-test for two samples is one of the most widely used tests in biomedical experiments; it assumes that the data follow the Gaussian (normal) distribution (with equal or unequal variances). However, this Gaussian assumption is stringent for most real-world data sets. The non-parametric counterpart of the t-test is the Wilcoxon rank-sum test (also known as the Mann-Whitney U test), which is the most widely used distribution-free hypothesis test. One can refer to Chapter 7 of Thomas Glover and Kevin Mitchell’s textbook35 (Tests of Hypothesis Involving Two Samples) for more details of the two commonly used testing methods.
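
For readers unfamiliar with the rank-sum test, the following Python sketch computes the Mann-Whitney U statistic and a two-sided p value using the large-sample normal approximation. It is a simplified illustration under stated assumptions (average ranks for ties, no tie or continuity correction), not the routine used in the study, and the sample values are synthetic.

```python
import math


def rank_sum_test(a, b):
    """Wilcoxon rank-sum (Mann-Whitney U) test via the normal approximation.

    Returns (U statistic of sample a, two-sided p value). Tied values receive
    average ranks; the variance has no tie or continuity correction.
    """
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2   # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal tail
    return u, p


# Synthetic samples (not study data): every control value exceeds every case value.
cases = [1, 2, 3, 4, 5, 6, 7, 8]
controls = [9, 10, 11, 12, 13, 14, 15, 16]
u_stat, p_value = rank_sum_test(cases, controls)
u_same, p_same = rank_sum_test(cases, cases)
```

Because the test depends only on ranks, a single extreme concentration changes the statistic no more than any other largest value would, which is why it is less sensitive to outliers than the t-test.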

Results

Table 1 shows the demographic characteristics of the subjects included in this study. A total of 100 patients with ESCC and 100 age-, sex- and region-matched normal control subjects were enrolled. The history of regular alcohol consumption and the history of regular cigarette smoking were similar between patients with ESCC and controls (p=0.856 and p=0.669, respectively). In contrast, more cases than controls had a family history of ESCC (29.0% vs 17.0%, p<0.05).

Table 1

Demographic characteristics of normal controls and patients with ESCC from Anyang, China, 2010

The data form a 200×43 matrix consisting of 100 patients with ESCC and 100 healthy subjects, with 43 feature variables. We test the aforementioned six classifiers on the original feature space as well as on the embedding spaces produced by the six dimensionality reduction methods. The dimension of most embedding spaces is set to 10 as a trade-off between accuracy and complexity, except that the FDA algorithm can only project into 1D because of its inherent limitation for two-class classification problems.
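
The one-dimensional limit of FDA in the two-class case follows from the rank of the between-class scatter matrix. The derivation below uses standard textbook notation that does not appear in the article and is therefore an assumption: m_+ and m_- are the class means, S_W is the within-class scatter matrix, S_B is the between-class scatter matrix and S_T is the total scatter matrix.

```latex
% Fisher criterion maximised by the projection vector w
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
S_W = \sum_{c \in \{+,-\}} \sum_{x_i \in c} (x_i - m_c)(x_i - m_c)^{\top}

% For two classes, S_B is proportional to a single outer product
S_B \propto (m_+ - m_-)(m_+ - m_-)^{\top}
\quad\Longrightarrow\quad
\operatorname{rank}(S_B) \le 1

% Hence S_B w = \lambda S_W w has at most one non-zero eigenvalue, with
w \propto S_W^{-1}\,(m_+ - m_-)

% FDAx replaces S_B by the total scatter S_T, which is generally of higher rank
S_T = \sum_{i} (x_i - m)(x_i - m)^{\top}
```

With only one non-zero eigenvalue there is a single useful projection direction, whereas substituting S_T admits more non-zero eigenvalues and therefore an embedding of higher dimension.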

The averaged classification accuracies and running times (in seconds) are reported in table 2 based on 10 rounds of fivefold cross validation. RF achieves the best accuracy of 98.38% on the original feature vectors (without dimensionality reduction), and SVM outperforms the other classifiers by yielding 96.56% on embedding spaces with dimensionality reduction. In contrast, NB is clearly inferior to the other five classifiers, possibly because of the ‘bogus’ conditional independence assumption. Besides, the three dimensionality reduction methods based on Fisher discriminant ratios (namely FFS, FDA and FDAx) are often more favourable than the other three methods, which do not use discriminant information. In terms of running time, dimensionality reduction evidently speeds up each classifier; moreover, three classifiers (namely NB, LR and SVM) are faster than the remaining three.

Table 2

Classification accuracies (in percentage) and runtime (in seconds) of the patient with ESCC

Through dimensionality reduction, the learnt parameters of FDA and FFS can reflect the correlation or significance of the chemical elements to EC. Specifically, the projection vector w of FDA can be regarded as linear weights on the original features; thus, larger weights mean higher contributions. For FFS, the F statistic indicates the discriminant capability of each chemical element: the larger F is, the greater the difference on that element. As shown in table 3, strontium (Sr) and sulfur (S) are the two elements with the most discriminant information. Other important elements may include P, U, Ca, Tl, Bi and Hg.

Table 3

The projection coefficients w in Fisher discriminant analysis and the Fisher discriminant ratio F used in Fisher feature selection

To further examine the discriminant capability, we plot the concentration distributions of these eight important elements and of one unimportant element (Se) for comparison. As shown in figure 1, the differences between cases and controls on these eight elements are pronounced, whereas the difference on Se is much smaller. The top part of table 4 lists the classification accuracies based on each single element.

Figure 1

Concentration distributions of eight important elements and one unimportant element (Se) for patients with oesophageal cancer and healthy controls.

Table 4

Classification accuracies (in percentage) based on single, pair and triple elements

As we can see, all six classifiers achieve accuracies of more than 90% based on the single most important element, Sr. The other two elements with distinctive differences are S and P, which provide accuracies of around 80%.

We also investigate whether the relation between any two elements differs between cases and controls. Figure 2 shows the distributions in normalised concentration for four selected pairs of elements: Sr-S, P-U, Ca-Tl and Bi-Hg. These 2D scatter plots are highly separable, though no straightforward functions are available to describe the pairwise relationships. The middle part of table 4 displays the classification accuracies based on a small set of element pairs. The pair of Sr and U achieves the best performance for four classifiers, whereas the pair of Sr and P outperforms other pairs for two classifiers. It appears that Sr and U carry the most diverse information and complement each other, although the classification performance on U alone is not satisfactory.

Figure 2

Distributions in normalised concentration for pairs of elements (Sr-S, U-P, Tl-Ca and Bi-Hg).

When the scatter plots of any three important elements are considered, the separability improves, as shown in figure 3. Almost any linear classifier can achieve good performance based on these triples of elements. The bottom part of table 4 displays classification results based on a small set of element triples. The triple composed of Sr, U and P achieves the best performance for all classifiers except RF.

Figure 3

Distributions in normalised concentration for combinations of three elements (Sr-U-P, Sr-U-S, Sr-Ca-P and Sr-P-S).

Finally, we attempt to identify the importance of different feature subsets by removing variables from the whole variable set (table 5). Note that this procedure is just a simple way to exclude a specific feature subset; it is not identical to the more complex backward elimination method in the variable selection literature, which depends on the order of removal. When the five demographic variables (age, gender, smoking history, drinking history and family history) are eliminated, the classification accuracies improve for four classifiers compared with the results of using the whole set of original features. For the other two classifiers, namely AB and RF, the classification performance degrades only slightly. It seems that the connections between demographic characteristics and EC are very weak. On the other hand, removing the six low-concentration elements is detrimental to the classification performance, although the decreases in accuracy are small.

Table 5

Classification accuracies (in percentage) on removing a subset of features

For comparison with the machine learning methods, the traditional hypothesis tests (t-test and rank-sum test) are applied. As shown in table 6, at the significance level of 5%, more than half of the features are significantly different between cases and controls in terms of the t-test and rank-sum test. We also notice that the means and SDs differ markedly between cases and controls in many features, partly due to extreme values or ‘outliers’ in the data. For example, for some elements, the highest concentration is more than a thousand times the average value, which strongly biases the hypothesis testing results. By contrast, machine learning methods are more robust when dealing with such outliers.

Table 6

The results of hypothesis tests: means and standard deviations of cases and controls, and the p value of t-test and rank-sum test (RS test) (the p values less than alpha=5% are boldfaced)

Discussion

In this article, we present a study of chemical elements in the serum of patients with ESCC based on supervised learning methods. As shown in table 6, at the significance level of 5%, more than half of the chemical elements are statistically significantly different between cases and controls. To our knowledge, previous relevant studies only focused on Se, Cu and Zn.12 36 37 Similarly, in our study, we observed lower levels of Se and Zn and higher levels of Cu among patients with ESCC compared with controls. One possible explanation is that Se is a primary component of selenoproteins, whose antioxidant role can regulate the redox status of some molecules and dampen the propagation of free radicals and reactive oxygen species (ROS),38 and Zn has a number of vital functions including cell proliferation, reproduction, immune function and defence against free radicals.39 Excess Cu is known to be a potent oxidant causing the generation of ROS in cells.40

However, the values for these elements varied significantly among studies conducted in different countries or regions. These inconsistent findings might result from racial factors, geographic variation and the varied sample sizes of the relevant studies. Therefore, we used a case–control study matched by age, sex and region to make the cases and controls comparable and to control potential confounders, such as region, smoking and drinking. In addition, in studies of chemical elements and health, the means and SDs usually differ enormously between the case and control groups over many features. This is mainly because of extreme values (or outliers) introduced into the data during measurement. For example, for certain elements, the highest concentration is more than a thousand times the average value, which greatly influenced the test results. However, machine learning methods are much more robust in dealing with such ‘outliers’.

In this study, our analysis based on machine learning methods gives promising results. Specifically, RF achieves the best accuracy of 98.38% on the original feature vectors (without dimensionality reduction), and SVM outperforms the other classifiers by yielding 96.56% on embedding spaces with dimensionality reduction. All six classifiers achieve accuracies of more than 90% based on the most important single element, Sr. The other two elements with distinctive differences are S and P, which provide accuracies of around 80%.

The contributions of this paper are twofold. First, we provide a principled framework to comprehensively investigate the chemical elements in blood serum of patients with ESCC; this framework can be easily extended to blood serum of other patients with cancer, or even for general diseases. The main impediment of coping with tens of chemical elements (38 in our study) can be efficiently solved by hypothesis testing and machine learning methods nowadays with ordinary computation platforms. Second, we find that great differences exist in element concentrations between patients with ESCC and healthy comparison subjects. Consequently, approaches such as ‘crude’ diagnosis before gold-standard examination (using biopsy), and early detection of EC, can have potential applications.

There are several merits in our proposed framework. First, the classification accuracies achieved by machine learning methods are remarkably higher than most empirical decisions made by doctors on this small corpus of 100 patients with EC and 100 healthy comparison subjects. Furthermore, the test error rates can tell us the confidence level of the predictions whenever a large data set is available to conduct this chemical element analysis. Second, the expense of this analysis is much lower than that of the gold-standard biopsy for EC and other cancers, though this analysis may only serve as a ‘crude’ diagnosis or prescreening. Third, the time needed to obtain the element concentrations by this approach is much shorter than that needed for biopsy. Fourth, blood sample acquisition is minimally invasive and can thus be easily performed in annual health examinations for early-stage precaution. This point is particularly essential for most patients with cancer, as locally advanced cancer or distant metastases in late stages are associated with high mortality.

There are also some hidden pitfalls in this framework that deserve caution. First, possible confounding biases may not have been controlled because data on several factors were absent, including body mass index, dietary intake and potential regional carcinogen contacts. Moreover, even if these factors were comparable, it remains impossible to eliminate the possibility that diverse genetic features might be associated with ESCC. Second, the present work is based on a relatively small patient cohort. This framework should be evaluated on a larger patient cohort before any real clinical application. Third, due to the retrospective design of this study, the results show associations only and give no cause–effect clues; they may not generalise to other populations, such as those of Europe or North America, with regard to ESCC. Fourth, among patients with ESCC, lower levels of some elements might be caused by eating difficulty. On the other hand, higher concentrations of vanadium, manganese and chromium were also observed. Therefore, the differences in element concentrations cannot be simply attributed to eating difficulty. This finding might give some useful hints for the study of ESCC. Fifth, this study cannot produce clinically relevant estimates of diagnostic accuracy, because such estimates would have to come from a study that recruited a clinically relevant patient sample. In addition, case–control studies almost always overestimate sensitivity, specificity and accuracy. Therefore, these classifiers need to be tested in a clinical setting before their use can be recommended. Finally, heavy metal contamination has become a severe problem in mainland China; however, this effect might be adjusted for by using a healthy control group that matches the patients with ESCC in age, sex and residential area, as we did in this study. If other factors such as age, sex and region are the same or similar, it is safe to attribute the large differences in element concentrations between the case group and the control group to the presence of the disease rather than to unrelated problems such as heavy metal contamination.

The proposed framework may have several new emerging applications. One possibility is to modify the chemical elements in pharmacy for enhancing the levels of certain element concentrations of a patient into a normal interval. Another perspective is to provide a rebalanced diet in nutrition for patients. These new applications will depend on the thorough and deep analysis of the chemical elements in serum among patients and healthy controls.

Conclusions

These results suggested element profile differences between patients with ESCC and controls, which indicated some potential promising applications in diagnosis, prognosis, pharmacy and nutrition of ESCC. In the future, the results of the analyses will be useful in designs that have larger sample sizes. However, the results should be interpreted with caution due to the retrospective design nature, limited sample size and the lack of several potential confounding factors, such as obesity, nutritional status, and fruit and vegetable consumption and potential regional carcinogen contacts.

References

  1. 1.
  2. 2.
  3. 3.
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8.
  9. 9.
  10. 10.
  11. 11.
  12. 12.
  13. 13.
  14. 14.
  15. 15.
  16. 16.
  17. 17.
  18. 18.
  19. 19.
  20. 20.
  21. 21.
  22. 22.
  23. 23.
  24. 24.
  25. 25.
  26. 26.
  27. 27.
  28. 28.
  29. 29.
  30. 30.
  31. 31.
  32. 32.
  33. 33.
  34. 34.
  35. 35.
  36. 36.
  37. 37.
  38. 38.
  39. 39.
  40. 40.

Footnotes

  • Contributors JYW, TBL and ZHH proposed and supervised the project. TL and YCL designed and carried out the experiments and analysed the data. LLY, TBL and CTZ contributed to the sample collection, chemical elements measurement, quality control and finding of related literature. ZXC participated in the discussion, helped to improve the proposed methods and copyedited the manuscript. TL, YCL and TBL wrote the manuscript.

  • Funding This work was supported by the National Natural Science Foundation of China under Grant No. 61375051, 61075119 and 81473033, and the Seeding Grant for Medicine and Information Sciences of Peking University under Grant No. 2014-MI-21.

  • Competing interests None declared.

  • Patient consent Obtained.

  • Ethics approval This study was approved by the Institutional Review Board of Peking University School of Oncology, Beijing.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Additional data are available by emailing wjy@bjmu.edu.cn.