Supplemental digital content 4:
Supplemental Text S4. Classification Algorithms and Parameter Settings
Support Vector Machine
The Support Vector Machine (SVM) is a supervised learning method that generates
input-output mapping functions from a set of labeled training data. The mapping function can
be either a classification function or a regression function. For classification, nonlinear kernel
functions are often used to transform the input data into a high-dimensional feature space in
which they become more separable than in the original input space. A maximum-margin
hyperplane is then constructed in that space. The resulting model depends on only a subset of
the training data near the class boundaries (the support vectors) [1].
In our research, two different kernel functions were selected: the linear kernel and the radial
basis function (RBF) kernel. The primary parameters were set as follows:
Stopping Criteria = 1×10⁻³; Regression Precision = 0.1; RBF γ = 0.1; γ = 1; Bias = 0;
Degree = 3.
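For illustration, the two configurations could be reproduced with scikit-learn's SVC. This is a minimal sketch under the assumption of a generic implementation (the original software package is not named above), with tol mapping to the stopping criterion and gamma to the RBF γ; the training data names are placeholders.

```python
# Minimal sketch of the two SVM configurations described above, using
# scikit-learn's SVC as a stand-in implementation.
from sklearn.svm import SVC

# Linear-kernel SVM; tol corresponds to the stopping criterion of 1e-3.
svm_linear = SVC(kernel="linear", tol=1e-3)

# RBF-kernel SVM; gamma corresponds to the "RBF γ = 0.1" setting.
svm_rbf = SVC(kernel="rbf", gamma=0.1, tol=1e-3)

# Fit on labeled training data (X_train: feature matrix, y_train: labels).
# svm_linear.fit(X_train, y_train)
# y_pred = svm_rbf.fit(X_train, y_train).predict(X_test)
```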
Neural Network
A neuron, the basic element of a neural network, is a simple virtual device that accepts
many inputs, sums them, applies a transfer function, and produces a result, either as a model
prediction or as input to other neurons.
Many such neurons, connected in a systematic way, make up a neural network. The
neurons in a network are arranged in layers. Typically, there is one input layer for input
neurons, one or more hidden layers for internal processing units, and one output layer for
output neurons. Each layer is fully connected to the preceding layer and the following layer.
The connections between neurons have weights associated with them, which determine
the strength of the influence one neuron has on another. Information flows from the input
layer through the hidden layers to the output layer to generate predictions. The network
"learns" to generate better and better predictions by adjusting the connection weights during
training so that its predictions match the target values of specific records.
In our research, the neural networks used were feed-forward neural networks. In a
feed-forward neural network, the input neurons have their activations set to the values of the
encoded input fields. The activation of each neuron in a hidden or output layer is calculated
as [2]:

a_i = σ( Σ_{j∈J} w_ij · o_j )    (1)

where a_i is the activation of neuron i, J is the set of neurons in the preceding layer, w_ij is
the weight of the connection between neuron i and neuron j, o_j is the output of neuron j, and
σ(x) is the sigmoid or logistic transfer function:

σ(x) = 1 / (1 + e^(−x))    (2)
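Equations (1) and (2) can be traced in a few lines of code. The following is a minimal sketch (using NumPy; the weight and output values are hypothetical) of how one neuron's activation is computed from the outputs of the preceding layer.

```python
import numpy as np

def sigmoid(x):
    # Equation (2): the logistic transfer function.
    return 1.0 / (1.0 + np.exp(-x))

def activation(weights, outputs):
    # Equation (1): weighted sum of the preceding layer's outputs,
    # passed through the sigmoid. weights[j] is w_ij, outputs[j] is o_j.
    return sigmoid(np.dot(weights, outputs))

# Hypothetical example: a neuron with three incoming connections.
w = np.array([0.5, -0.2, 0.1])
o = np.array([1.0, 0.3, 0.7])
print(activation(w, o))  # sigmoid(0.51) ≈ 0.625
```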
The primary parameters were set as follows:
Number of Hidden Layers = 2; Layer 1 = 20; Layer 2 = 15; Persistence = 200; α = 0.9;
Initial η = 0.3; η Decay = 30; High η = 0.1; Low η = 0.01.
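A roughly equivalent network could be configured with scikit-learn's MLPClassifier; this is a sketch under the assumption that sigmoid activations and stochastic gradient descent match the description above. "Persistence" and the η decay/high/low schedule have no direct counterparts in this API and are omitted here.

```python
# Approximate translation of the network settings above; not the authors'
# original software.
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(
    hidden_layer_sizes=(20, 15),   # Layer 1 = 20, Layer 2 = 15
    activation="logistic",         # sigmoid transfer function, Eq. (2)
    solver="sgd",
    learning_rate_init=0.3,        # Initial η
    momentum=0.9,                  # α
)
# mlp.fit(X_train, y_train)
```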
Decision Tree (C4.5)
Decision tree learning is a method commonly used in data mining whose goal is to create
a model that predicts the value of a target variable based on several input variables. The basic
components of a decision tree are nodes, edges, and leaves. Each node corresponds to one of
the input variables; there are edges to children for each of the possible values of that input
variable; and each leaf represents a value of the target variable given the values of the input
variables represented by the path from the root to the leaf.
C4.5, developed by Ross Quinlan, is an algorithm for generating decision trees [3].
C4.5 builds decision trees from a set of training data using the concept of information entropy.
At each node of the tree, it selects the attribute that most effectively splits the sample set into
subsets enriched in one class or the other. Its criterion is the normalized information gain
(difference in entropy) that results from choosing an attribute to split the data: the attribute
with the highest normalized information gain is chosen to make the decision. The C4.5
algorithm then recurses on the smaller sublists.
The primary parameters were set as follows:
Maximum Depth = 7; Pruning Severity = 75; Minimum Percentage of Records in a Node =
0.05; Minimum Percentage of Records for a Split = 0.10.
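To make the splitting criterion concrete, the following sketch computes C4.5's gain ratio (normalized information gain) for one candidate attribute; the toy data at the bottom are hypothetical.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels, in bits.
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(labels, attribute_values):
    # C4.5's criterion: information gain from splitting on the attribute,
    # normalized by the split information (entropy of the attribute itself).
    total = len(labels)
    subsets = {}
    for lab, val in zip(labels, attribute_values):
        subsets.setdefault(val, []).append(lab)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attribute_values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical example: a binary attribute that separates the classes perfectly.
y = ["pos", "pos", "neg", "neg"]
a = ["left", "left", "right", "right"]
print(gain_ratio(y, a))  # 1.0
```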
Logistic Regression
Logistic regression is a statistical technique for classifying records based on the values of
input fields. It is similar to linear regression but takes a categorical target field instead of a
numeric one, and it supports two different models: a binomial model (two classes) and a
multinomial model (more than two classes).
Logistic regression builds a set of equations that relate the input field values to the
probabilities associated with each of the output field categories. The target category with the
highest probability is assigned as the predicted output value for each record.
In our research, the binomial model was selected because there were two classes of subjects.
The primary parameters were set as follows:
Singularity Tolerance = 1×10⁻⁸; Scale = 1; Maximum Iterations = 100; Log-likelihood
Convergence = 0; Parameter Convergence = 1×10⁻⁶; σ = 0.
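A minimal sketch of an equivalent binomial model using scikit-learn's LogisticRegression (assumed here as a stand-in; the original software is not named), with tol and max_iter loosely mirroring the convergence settings above:

```python
from sklearn.linear_model import LogisticRegression

# Binomial (two-class) logistic regression; tol approximates the parameter
# convergence tolerance, max_iter the iteration limit listed above.
logreg = LogisticRegression(tol=1e-6, max_iter=100)
# logreg.fit(X_train, y_train)
# probs = logreg.predict_proba(X_test)   # probability of each category
# y_pred = logreg.predict(X_test)        # category with the highest probability
```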
Linear Discriminant Analysis
Linear discriminant analysis (LDA) is a method used in machine learning to find a
linear combination of features that characterizes or separates two or more classes of
objects.
LDA is closely related to ANOVA (analysis of variance) and regression analysis, which
also attempt to express one dependent variable as a linear combination of other features or
measurements [4]. In the other two methods, however, the dependent variable is a numerical
quantity, while for LDA it is a categorical variable.
For two classes, LDA assumes that the class-conditional probability density functions are
both normally distributed, with mean and covariance parameters (μ₀, Σ₀) and (μ₁, Σ₁),
respectively. Under this assumption, the Bayes-optimal solution is to predict a point as
belonging to the second class if the log of the likelihood ratio is below some threshold T.
The primary parameters were set as follows:
Prior Probabilities = All Groups Equal; Covariance Matrix = Within-groups; Estimation =
Maximum-likelihood Estimators; Tolerance to Singularity = 1×10⁻⁴.
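For illustration, the equal-priors setting could be approximated with scikit-learn's LinearDiscriminantAnalysis; a minimal sketch, assuming two groups:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Equal prior probabilities for the two groups ("All Groups Equal");
# tol mirrors the singularity tolerance of 1e-4 listed above.
lda = LinearDiscriminantAnalysis(priors=[0.5, 0.5], tol=1e-4)
# lda.fit(X_train, y_train)
# y_pred = lda.predict(X_test)
```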
References:
[1] Ben-Hur A, Weston J. A User's Guide to Support Vector Machines. Methods Mol Biol.
2010;609:223-239.
[2] Fine TL. Feedforward Neural Network Methodology, 3rd ed. New York: Springer-Verlag;
1999.
[3] Quinlan JR. C4.5: Programs for Machine Learning. Burlington: Morgan Kaufmann
Publishers; 1993.
[4] McLachlan GJ. Discriminant Analysis and Statistical Pattern Recognition. Hoboken: Wiley; 2004.