Classification Using Top Scoring Pair Based Methods
Tina Gui

Outline
Introduction
Top Scoring Pair
Experiments Design
Future Work
Conclusion

Introduction
With DNA microarray technology, current classification methods have two limitations (Geman et al., 2004):
1. Small samples
2. Lack of interpretability
Objective: differentiate between two classes by finding pairs of genes whose expression levels typically invert from one class to the other (Geman et al., 2004).
Reference: D. Geman, C. d'Avignon, D. Naiman and R. Winslow (2004). "Classifying gene expression profiles from pairwise mRNA comparisons".

Approaches
Rank-based approach
Drawback: information is lost using this procedure.
Comparison-based approach
In some cases, accurate prediction can be achieved by comparing the expression levels of a single pair of genes.
A simple example of classifying gene expression profiles this way is the Top Scoring Pair (TSP) classifier.

Top Scoring Pair
Each profile consists of G genes with expression levels X = {X1, X2, …, XG}.
Each profile X has a true class label in {1, 2, …, C}; here C = 2.
Marker gene pairs (i, j): pairs for which the probability of Xi < Xj differs significantly between class 1 and class 2.
Profile classification is then based on the collection of distinguished pairs.

Top Scoring Pair
The quantities of interest:
pij(c) = P(Xi < Xj | c), c = 1, 2 (the probability of observing Xi < Xj in class c)
Δij = |pij(1) − pij(2)| (the "score" of the pair (i, j), computed from the expression values)

Top Scoring Pair
1. Rank the expression values.
2. Rank the scores Δij from largest to smallest.
3. Select all pairs achieving the top score.
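The score defined above can be illustrated with a short sketch. It assumes that pij(c) is estimated as the fraction of class-c training profiles in which Xi < Xj, and it uses a made-up 6-by-3 expression matrix. The function names (pair_probabilities, tsp_scores, top_scoring_pairs) and the data are hypothetical and are not taken from the slides.

```python
import numpy as np

def pair_probabilities(X, y, c):
    """Estimate p_ij(c) = P(X_i < X_j | c) by relative frequency (an assumption)."""
    Xc = X[y == c]                            # profiles of class c, shape (n_c, G)
    less = Xc[:, :, None] < Xc[:, None, :]    # less[n, i, j] is True when X_i < X_j
    return less.mean(axis=0)                  # shape (G, G)

def tsp_scores(X, y):
    """Return the score matrix Delta_ij = |p_ij(1) - p_ij(2)|."""
    return np.abs(pair_probabilities(X, y, 1) - pair_probabilities(X, y, 2))

def top_scoring_pairs(delta):
    """Return every pair (i, j) with i < j that reaches the maximum score."""
    rows, cols = np.triu_indices(delta.shape[0], k=1)
    scores = delta[rows, cols]
    best = scores.max()
    keep = np.isclose(scores, best)
    pairs = [(int(i), int(j)) for i, j in zip(rows[keep], cols[keep])]
    return pairs, float(best)

# Hypothetical toy data: rows are profiles, columns are genes 0..2.
X = np.array([
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 1.0],
    [0.5, 4.0, 7.0],
    [5.0, 1.0, 3.5],
    [6.0, 2.0, 0.5],
    [4.0, 0.5, 6.0],
])
y = np.array([1, 1, 1, 2, 2, 2])

delta = tsp_scores(X, y)
pairs, best = top_scoring_pairs(delta)
print(pairs, best)
```

In this toy data, gene 0 is below gene 1 in every class 1 profile and above it in every class 2 profile, so the pair (0, 1) reaches the maximum possible score of 1. The dense G-by-G comparison is fine for a sketch but would need a more memory-efficient formulation for thousands of genes.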
Example of scoring a gene pair:
52 profiles in class 1, 50 profiles in class 2
pij(1) = 50/52, pij(2) = 3/50

Top Scoring Pair Classifier
Computing the score: Δij = |50/52 − 3/50| ≈ 0.90
Note: since pij(1) > pij(2), the classifier based on this gene pair votes for class 1 for a profile with Xi < Xj, and for class 2 otherwise.

K-TSP Classifier
In some instances, the TSPs may change when the training data are perturbed by adding or deleting a few examples.
The K-TSP classifier uses the k top scoring disjoint gene pairs from the list.
Goal: increase the accuracy of the TSP classifier.

Experiments Design
Baseline
Augmented Space
Alternate Space

Baseline
Raw data: N profiles by M attributes (A1 .. AM).
The TSP classifier ranks attribute pairs: (A13 : A45), (A7 : A21), (A1 : A72), (A1 : A25), .., (Ax : Ay).

Augment
Add the top ranked pairs from the K-TSP classifier, (A13 : A45), (A7 : A21), (A1 : A72), (A1 : A25), .., (Aa : Ab), as K new columns.
Resulting data: N profiles by M+K attributes (the original attributes plus A7_45, A13_21, A1_72, .., Aa_b).

Alteration
Deal with the K-TSP columns only.
Resulting data: N profiles by K attributes (A7_45, A13_21, A1_72, .., Ax_y).

Future Work
Combination of Decision Tree and Top Scoring Pairs (Czajkowski and Krętowski, 2011).
Reference: Czajkowski M, Krętowski M. (2011). "Top Scoring Pair Decision Tree for Gene Expression Data Analysis".

Conclusion
TSP classifier predictions are based entirely on the top-scoring pairs.
Beauty of Top Scoring Pair: simplicity
Main goal: improve the classification accuracy
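Supplementary sketch
The K-TSP idea from the slides (take the k top scoring disjoint gene pairs and let each pair vote) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: pij(c) is estimated by relative frequency, ties between equally scored pairs are broken by enumeration order, the votes are combined by an unweighted majority, and k is assumed odd so the vote cannot tie. The function names (fit_ktsp, predict_ktsp) and the data are hypothetical.

```python
import numpy as np

def fit_ktsp(X, y, k):
    """Greedily pick the k highest-scoring pairs that share no gene."""
    X1, X2 = X[y == 1], X[y == 2]
    p1 = (X1[:, :, None] < X1[:, None, :]).mean(axis=0)   # estimate of p_ij(1)
    p2 = (X2[:, :, None] < X2[:, None, :]).mean(axis=0)   # estimate of p_ij(2)
    delta = np.abs(p1 - p2)
    G = X.shape[1]
    candidates = [(delta[i, j], i, j) for i in range(G) for j in range(i + 1, G)]
    candidates.sort(key=lambda t: -t[0])   # stable sort: ties keep enumeration order
    used, pairs = set(), []
    for _, i, j in candidates:
        if i in used or j in used:
            continue                       # enforce disjoint pairs
        # Record which class the pair votes for when X_i < X_j.
        class_if_less = 1 if p1[i, j] > p2[i, j] else 2
        pairs.append((i, j, class_if_less))
        used.update((i, j))
        if len(pairs) == k:
            break
    return pairs

def predict_ktsp(pairs, x):
    """Unweighted majority vote over the selected pairs (assumes an odd number of pairs)."""
    votes_for_1 = 0
    for i, j, class_if_less in pairs:
        vote = class_if_less if x[i] < x[j] else 3 - class_if_less
        votes_for_1 += int(vote == 1)
    return 1 if votes_for_1 > len(pairs) / 2 else 2

# Hypothetical toy data: 6 profiles, 6 genes.
X = np.array([
    [1.0, 8.0, 2.0, 9.0, 3.0, 7.0],
    [2.0, 7.0, 1.0, 8.0, 2.0, 9.0],
    [3.0, 9.0, 3.0, 7.0, 1.0, 8.0],
    [8.0, 1.0, 9.0, 2.0, 7.0, 3.0],
    [7.0, 2.0, 8.0, 1.0, 9.0, 2.0],
    [9.0, 3.0, 7.0, 3.0, 8.0, 1.0],
])
y = np.array([1, 1, 1, 2, 2, 2])

pairs = fit_ktsp(X, y, k=3)
# A new profile in which the third pair disagrees with the other two.
new_profile = np.array([1.0, 8.0, 2.0, 9.0, 9.0, 1.0])
prediction = predict_ktsp(pairs, new_profile)
print(pairs, prediction)
```

Because each pair only casts one vote, a single inverted pair in a new profile is outvoted by the remaining pairs, which is the robustness that motivates using k pairs instead of one.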