Download Supplementary Methods - Cancer Prevention Research
Supplementary Methods

Analysis of 51 SSA/P signature genes in published microarray data of serrated polyps

Quantile-normalized expression data for each signature gene were downloaded from the Gene Expression Omnibus (GEO) under accession number GSE43841. Expression data from six SSA/Ps, six MVHPs and six control colon FFPE samples (three right and three left) were evaluated for expression of our gene markers. Hierarchical clustering of log2 ratio values comparing each individual colon sample (SSA/P, sessile serrated adenoma/polyp; MVHP, microvesicular hyperplastic polyp; CTRL, control colon) to the mean of all 18 colon samples is shown. Red and green denote overexpression and underexpression, respectively. Clustering was performed using a correlation metric and complete linkage.

Sensitivity and specificity of a seven-gene panel

It is well known that applying an algorithm to the data on which it was trained gives an overly optimistic estimate of performance. Cross-validation is designed to give a more accurate estimate of performance using training data sets only slightly smaller than the original data. Briefly, K-fold cross-validation works as follows. The samples are partitioned into K complementary subsets of roughly equal size. Each "fold" consists of training the algorithm on a training data set formed from the union of K-1 of these subsets, and then validating on the remaining subset, called the testing data set. This procedure is repeated K times in such a way that each sample appears in a testing data set exactly once. For n-fold cross-validation, where n is the number of samples (leave-one-out), each testing data set consists of a single sample.

qPCR validation of four signature genes

mRNA expression of FSCN1, MUC6, SEMG1 and ZIC5 was determined using commercially available TaqMan gene expression assays (Invitrogen) and an Applied Biosystems 7900HT real-time PCR instrument. 10 µL qPCR reactions were performed with forward and reverse primers, internal probe, master mix and 10-15 ng cDNA. cDNA was made from total RNA using the High Capacity RNA to cDNA kit (Invitrogen). A total of 73 samples were analyzed: 21 SSA/Ps, 12 HPs, 17 uninvolved and 23 control colon. Beta-actin was used as the reference gene and control colon as the baseline for determining fold change using the ΔΔCT method. Statistical significance was determined by the non-parametric Mann-Whitney U test.
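
The fold-change calculation named above can be illustrated with a minimal sketch. It assumes the common 2^(-ΔΔCT) form of the ΔΔCT method (which presumes roughly 100% amplification efficiency) and assumes that the baseline is the mean ΔCT of the control colon group; the text does not spell out either detail. The function name `fold_change` and all CT values are hypothetical and are not data from the study.

```python
from statistics import mean


def fold_change(ct_target, ct_reference, baseline_delta_ct):
    """Relative expression of one sample as 2^(-ddCT)."""
    # dCT: normalize the target gene to the reference gene (beta-actin here).
    delta_ct = ct_target - ct_reference
    # ddCT: compare the sample's dCT with the baseline group's dCT.
    delta_delta_ct = delta_ct - baseline_delta_ct
    return 2.0 ** (-delta_delta_ct)


# Hypothetical (target CT, beta-actin CT) pairs; not data from the study.
control_colon = [(30.0, 18.0), (31.0, 19.0), (29.0, 17.0)]
polyps = [(26.0, 18.0), (27.0, 18.0)]

# Assumption: the baseline is the mean dCT across control colon samples.
baseline = mean(target - ref for target, ref in control_colon)

fold_changes = [fold_change(target, ref, baseline) for target, ref in polyps]
print(fold_changes)  # [16.0, 8.0]
```

A lower CT means more starting template, so a sample whose ΔCT is 4 cycles below the baseline corresponds to a 16-fold increase under these assumptions.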
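
The K-fold cross-validation procedure described under "Sensitivity and specificity of a seven-gene panel" can be sketched in a few lines. This is an illustration under stated assumptions, not the classifier used for the seven-gene panel, which the text does not specify. The midpoint classifier, the function names (`k_fold_splits`, `held_out_predictions`, `sensitivity_specificity`) and the scores are all hypothetical.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def held_out_predictions(xs, ys, k, fit, predict, seed=0):
    """Predict every sample with a model trained on the other K-1 subsets."""
    predictions = [None] * len(xs)
    for train, test in k_fold_splits(len(xs), k, seed):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            predictions[i] = predict(model, xs[i])
    return predictions


def sensitivity_specificity(ys, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(y == 1 and p == 1 for y, p in zip(ys, predictions))
    tn = sum(y == 0 and p == 0 for y, p in zip(ys, predictions))
    return tp / ys.count(1), tn / ys.count(0)


# Toy one-feature classifier (hypothetical): cut halfway between class means.
def fit_midpoint(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Made-up expression scores; 1 = SSA/P, 0 = other. Not data from the study.
scores = [0.1, 0.3, 0.2, 0.4, 2.1, 2.4, 1.9, 2.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# K equal to the number of samples gives leave-one-out cross-validation.
preds = held_out_predictions(scores, labels, len(scores), fit_midpoint, predict_midpoint)
sensitivity, specificity = sensitivity_specificity(labels, preds)
```

Because every prediction comes from a model that never saw the sample being predicted, the resulting sensitivity and specificity avoid the optimistic bias of evaluating on the training data.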