Multidimensional Analysis

If you are comparing more than two conditions (for example, 10 types of cancer) or looking at a time series (cell cycle or progression of cancer), you are looking at a multidimensional problem.

Example: 6000 genes in 10 patients
• 6000 points in 10-dimensional space (gene view)
• 10 points in 6000-dimensional space (patient view)

Reduction of dimensions
• Principal Component Analysis (PCA)
• Clustering
• Correspondence Analysis

Patient view: Classification
1: patients surviving 5 years after breast cancer surgery
2: patients dead within 5 years of breast cancer surgery

Other classifiers
• Neural Networks
• Support Vector Machines
• Other classifiers from the statistical literature

Issues in building a classifier
• Feature selection: a selected subset of genes (for example, chosen by t-test) may give a better classifier than using all genes
• Independent validation: you must test the classifier on samples that were not used for feature selection or for building the classifier (training set / test set, or leave-one-out cross-validation)

Promoter Analysis
• Genes that pass the significance test are clustered and their corresponding promoter regions are extracted.
• The regions are searched for potential transcription factor binding sites that they have in common.
• Saco-patterns looks for exactly identical patterns.
• The Gibbs sampler allows for degeneracy of patterns by describing them with a weight matrix.
• Transfac is a database of known transcription factor binding sites.
• Patterns can be assessed by their overrepresentation in the cluster relative to a background set.
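The last point under Promoter Analysis, assessing a pattern by its overrepresentation in a cluster relative to a background set, can be illustrated with a small sketch. The slides do not say which statistic is used; the hypergeometric tail probability below is one common choice and is an assumption here, as are the toy sequences, the motif, and the function names `count_with_motif` and `hypergeom_tail`. Exact substring matching stands in for an exact-pattern search and does not reproduce any particular tool.

```python
from math import comb


def count_with_motif(promoters, motif):
    """Count promoter sequences containing the motif as an exact substring."""
    return sum(motif in seq for seq in promoters)


def hypergeom_tail(k, n, K, N):
    """P(X >= k): probability of seeing at least k motif-containing promoters
    in a cluster of size n drawn at random from N promoters, K of which
    contain the motif (assumed test; not named in the slides)."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


# Hypothetical toy data: a motif and short made-up promoter fragments.
motif = "TGACTCA"
cluster = [
    "ACGTTGACTCAGG",
    "TTTGACTCAACGA",
    "GGCATGACTCATT",
    "ACGTACGTACGTA",
]
other_genes = [
    "CCGTAGGCTAACG",
    "TTAGGCATCGATC",
    "GATCGATTGACTCA",
    "AACCGGTTAACCG",
    "CGCGATATCGCGA",
    "TATATAGCGCGCT",
    "GGGTTTAAACCCG",
    "ACACACGTGTGTG",
]

# The background set here is every promoter, including those in the cluster.
background = cluster + other_genes

k = count_with_motif(cluster, motif)      # motif-containing promoters in cluster
K = count_with_motif(background, motif)   # motif-containing promoters overall
p_value = hypergeom_tail(k, len(cluster), K, len(background))

print(f"{k}/{len(cluster)} in cluster vs {K}/{len(background)} overall, p = {p_value:.4f}")
```

A small p-value suggests the pattern occurs in the cluster more often than random sampling from the background would explain. With many candidate patterns, the p-values would also need a multiple-testing correction, which this sketch leaves out.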