Multidimensional Analysis
If you are comparing more than two conditions (for example, 10 types of cancer) or if you are looking at a time series (cell cycle or progression of cancer), you are looking at a multidimensional problem.
Example: 6000 genes in 10 patients
• 6000 points in 10-dimensional space (gene view)
• 10 points in 6000-dimensional space (patient view)
Reduction of dimensions:
• Principal Component Analysis (PCA)
• Clustering
• Correspondence Analysis
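As a rough illustration of dimension reduction in the patient view, the sketch below projects 10 patients from 6000-dimensional gene space onto two principal components. The data are synthetic, and the function name `pca_patient_view`, the 300 shifted genes and the shift size are assumptions made only for this example. PCA is computed here with a plain singular value decomposition.

```python
import numpy as np

def pca_patient_view(expr, n_components=2):
    """Project patients onto the top principal components.

    expr: array of shape (n_genes, n_patients), e.g. (6000, 10).
    Returns (scores, explained): scores has shape (n_patients, n_components),
    explained holds the fraction of variance carried by each component.
    """
    X = expr.T                      # patient view: one row per patient
    X = X - X.mean(axis=0)          # center each gene across patients
    # Thin SVD: X = U @ diag(s) @ Vt; the rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Synthetic stand-in for real expression data (assumption, not from the slides).
rng = np.random.default_rng(0)
expr = rng.normal(size=(6000, 10))
expr[:300, 5:] += 4.0               # 300 genes shifted in the last 5 patients

scores, explained = pca_patient_view(expr)
print(scores.shape)                 # (10, 2): each patient is now a 2-D point
```

With a group difference of this size, the first component should place the two groups of patients on opposite sides of zero, which is the kind of structure a plot of the patient view is meant to reveal.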
Patient view
Classification
Class 1: patients surviving 5 years after breast cancer surgery
Class 2: patients who died within 5 years of breast cancer surgery
Other classifiers
• Neural Networks
• Support Vector Machines
• Other classifiers from statistical literature
Issues in building a classifier
• Feature selection: a selected group of genes may be optimal (selected, for example, by t-test)
• Independent validation: you must test the classifier on samples that were not used for feature selection or for building the classifier (training set/test set split, or leave-one-out cross-validation)
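The two issues above interact: if genes are selected using all samples and only the classifier is cross-validated, the held-out sample has already influenced the result. A minimal sketch of leave-one-out cross-validation with the t-test gene selection repeated inside each fold is given below. The nearest-centroid classifier, the function names, and the synthetic data are assumptions chosen to keep the example short; they are not taken from the slides.

```python
import numpy as np

def t_scores(X, y):
    """Welch t-statistic per gene; X is (n_samples, n_genes), y holds 0/1 labels."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def loocv_accuracy(X, y, n_genes=20):
    """Leave-one-out cross-validation with feature selection inside the loop."""
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i            # hold out sample i
        Xtr, ytr = X[train], y[train]
        # Rank genes on the training samples only, never on the held-out one.
        top = np.argsort(-np.abs(t_scores(Xtr, ytr)))[:n_genes]
        # Nearest-centroid classifier on the selected genes.
        c0 = Xtr[ytr == 0][:, top].mean(axis=0)
        c1 = Xtr[ytr == 1][:, top].mean(axis=0)
        x = X[i, top]
        pred = int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))
        correct += int(pred == y[i])
    return correct / n

# Synthetic data (assumption): 20 patients, 1000 genes, 30 informative genes.
rng = np.random.default_rng(1)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(20, 1000))
X[y == 1, :30] += 3.0

acc = loocv_accuracy(X, y)
print(f"leave-one-out accuracy: {acc:.2f}")
```

Each class needs at least two training samples in every fold, otherwise the variance estimate in `t_scores` is undefined.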
Promoter Analysis
• Genes that pass the significance test are clustered, and their corresponding promoter regions are extracted.
• Regions are searched for potential transcription factor binding sites that they have in common.
• Saco-patterns looks for exactly identical patterns.
• Gibbs sampler allows for degeneracy of patterns, using a weight matrix description.
• TRANSFAC is a database of known transcription factor binding sites.
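To show what a weight matrix description of a degenerate pattern looks like, the sketch below builds a log-odds matrix from a handful of aligned sites and scans a promoter sequence for the best-scoring window. It does not implement Gibbs sampling, which would be needed to find the alignment in the first place. The sites, the test sequence, the pseudocount, and the uniform background frequency of 0.25 are all hypothetical.

```python
import math

ALPHABET = "ACGT"

def weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix (one dict per position) from equal-length sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return matrix

def best_hit(matrix, sequence):
    """Return (score, offset) of the best-scoring window in the sequence."""
    width = len(matrix)
    hits = [
        (sum(matrix[i][sequence[k + i]] for i in range(width)), k)
        for k in range(len(sequence) - width + 1)
    ]
    return max(hits)

# Hypothetical aligned sites; positions 4 and 6 are degenerate.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]
wm = weight_matrix(sites)
score, offset = best_hit(wm, "CCGTTGAGTCAGGA")
print(offset)   # 4: the window TGAGTCA starts at index 4
```

Unlike an exact-match search, the matrix still gives a positive score to a site that differs from the consensus at a degenerate position. The sketch assumes sequences contain only A, C, G and T.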
Patterns can be assessed based on their overrepresentation in the cluster relative to a background set.
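One common way to quantify overrepresentation is a hypergeometric test: given how many background promoters contain the pattern, how surprising is the number seen in the cluster? The sketch below uses exact substring matching and assumes the cluster is a subset of the background. The sequences and function names are made up for illustration, and this is not a description of how Saco-patterns scores patterns.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n promoters are drawn from N, K of which contain the pattern."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def overrepresentation(pattern, cluster, background):
    """Return (k, K, p): pattern counts in cluster and background, and the p-value.

    Assumes every cluster promoter is also part of the background set.
    """
    k = sum(pattern in seq for seq in cluster)
    K = sum(pattern in seq for seq in background)
    return k, K, hypergeom_tail(k, len(cluster), K, len(background))

# Toy promoters (hypothetical): 3 of 4 cluster promoters carry the pattern,
# but only 4 of the 20 background promoters do.
cluster = ["AATGACTCAGG", "TGACTCATTTT", "CCCTGACTCAC", "ACGTACGTACG"]
others = ["ACGTACGTACG"] * 15 + ["GGTGACTCAGG"]
background = cluster + others

k, K, p = overrepresentation("TGACTCA", cluster, background)
print(k, K, round(p, 4))   # 3 4 0.0134
```

When many candidate patterns are tested, the resulting p-values need a multiple-testing correction before any single pattern is called significant.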