Download Supplementary Methods - Cancer Prevention Research

Supplementary Methods
Analysis of 51 SSA/P signature genes in published microarray data of serrated polyps
Quantile-normalized expression data for each signature gene were downloaded
from the Gene Expression Omnibus (GEO) under accession number GSE43841.
Expression data from six SSA/Ps, six MVHPs and six control colon FFPE samples
(three right and three left) were evaluated for expression of our gene markers.
Hierarchical clustering of log2 ratio values comparing each individual colon sample
(SSA/P – sessile serrated adenoma/polyp, MVHP – microvesicular hyperplastic polyp,
CTRL – control colon) to the mean of all 18 colon samples is shown. Red and green
denote overexpression and underexpression, respectively. Clustering was performed
using a correlation metric and complete linkage.
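
The clustering step described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes linear-scale intensities, an arithmetic mean across samples as the reference for the log2 ratio, and clustering of samples rather than genes. The function names (log2_ratios, correlation_distance, complete_linkage) are hypothetical.

```python
import math

def log2_ratios(expr):
    """expr[g][s] is the linear-scale intensity of gene g in sample s (assumed).
    Returns log2(value / mean of that gene across all samples)."""
    out = []
    for row in expr:
        mean = sum(row) / len(row)
        out.append([math.log2(v / mean) for v in row])
    return out

def correlation_distance(x, y):
    """1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.
    Returns the merge history as (cluster_a, cluster_b, distance),
    where clusters are tuples of profile indices."""
    n = len(profiles)
    dist = {(i, j): correlation_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist[(min(i, j), max(i, j))] for i in a for j in b)

    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        d, a, b = min((linkage(a, b), a, b)
                      for k, a in enumerate(clusters) for b in clusters[k + 1:])
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges
```

To cluster samples, the gene-by-sample ratio matrix is transposed first, for example `complete_linkage(list(zip(*log2_ratios(expr))))`. A library routine such as SciPy's hierarchical clustering would normally be preferred for real data.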
Sensitivity and specificity of a seven-gene panel
It is well known that application of an algorithm to the data on which it was
trained gives an overly optimistic estimate of performance. Cross-validation is designed
to give a more accurate estimate of performance using training data sets only slightly
smaller than the original data. Briefly, K-fold cross-validation works as follows. The
samples are partitioned into K complementary subsets of roughly equal size. Each “fold”
consists of training the algorithm on a training data set formed from the union of K-1 of
these subsets, and then validating on the remaining subset, called the testing data set.
This procedure is repeated K times in such a way that each sample appears in a testing
data set exactly once. For n-fold cross-validation, where n is the number of samples,
each testing data set consists of a single sample.
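
The K-fold procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the classifier is passed in as hypothetical fit and predict callables, the random assignment of samples to subsets is an assumption, and sensitivity_specificity assumes that both classes are present in the labels.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment (assumed)
    # Partition into k complementary subsets of roughly equal size.
    subsets = [indices[i::k] for i in range(k)]
    for i, test in enumerate(subsets):
        # Training set: union of the other k-1 subsets.
        train = [idx for j, s in enumerate(subsets) if j != i for idx in s]
        yield sorted(train), sorted(test)

def cross_validate(samples, labels, k, fit, predict, seed=0):
    """Return one out-of-fold prediction per sample."""
    predictions = [None] * len(samples)
    for train, test in k_fold_splits(len(samples), k, seed):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        for i in test:
            predictions[i] = predict(model, samples[i])
    return predictions

def sensitivity_specificity(labels, predictions, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)
```

Setting k equal to the number of samples gives the n-fold (leave-one-out) case, in which every testing data set holds a single sample.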
qPCR validation of four signature genes FSCN1, MUC6, SEMG1 and ZIC5
mRNA expression for each gene was determined using commercially available
TaqMan gene expression assays (Invitrogen) and an Applied Biosystems 7900HT real-time
PCR instrument. 10 µl qPCR reactions were performed with forward and reverse
primers, internal probe, master mix and 10-15 ng cDNA. cDNA was made from total
RNA using the High Capacity RNA to cDNA kit (Invitrogen). A total of 73 samples were
analyzed: 21 SSA/Ps, 12 HPs, 17 uninvolved and 23 control colon samples. Beta-actin was
used as a reference and control colon as the baseline for determining fold change using
the ΔΔCT method. Statistical significance was determined by the non-parametric
Mann-Whitney U test.
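
The ΔΔCT fold-change calculation can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes the standard 2^(-ΔΔCT) form (roughly 100% amplification efficiency) and takes the baseline to be the mean ΔCT of the control colon samples. The function names are hypothetical.

```python
def mean_delta_ct(ct_pairs):
    """Mean dCT over (ct_target, ct_reference) pairs, e.g. the baseline group."""
    deltas = [target - reference for target, reference in ct_pairs]
    return sum(deltas) / len(deltas)

def fold_change_ddct(ct_target, ct_reference, baseline_delta_ct):
    """Relative expression 2^(-ddCT) for one sample.

    ct_target:         CT of the gene of interest in the sample
    ct_reference:      CT of the reference gene (beta-actin here) in the sample
    baseline_delta_ct: mean dCT of the baseline group (control colon, assumed)
    """
    delta_ct = ct_target - ct_reference             # normalize to reference gene
    delta_delta_ct = delta_ct - baseline_delta_ct   # compare with baseline
    return 2.0 ** (-delta_delta_ct)
```

For example, a sample whose ΔCT is 3 cycles lower than the baseline ΔCT corresponds to an 8-fold increase in expression under these assumptions.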