Gene-Disease Associations Based on Network
Presenter: Li Jinjin (李金金)

Contents
1. Background
2. Heterogeneous network
3. Methods

Background
Correctly identifying associations between genes and diseases has long been a goal in biology. Identifying these associations has contributed to improving medical care and to understanding gene functions and interactions.
Clinical diseases are characterized by distinct phenotypes. To identify disease genes, the relationship between genes and phenotypes must therefore be taken into account.

Background: Problems
[Diagram: Gene, Disease, and Phenotype boxes; the gene-disease association is recast as a gene-phenotype association.]

Construction of the heterogeneous network
Gene network A_G, based on HPRD.
[Figure: gene network with nodes g1 to g7]
Phenotype network A_P, built using MimMiner.
[Figure: phenotype network with nodes p1, p2, p4, p5]
Gene-phenotype network B, based on OMIM.
[Figure: bipartite network linking genes g1 to g7 with phenotypes p1, p2, p4, p5]

With A_G (n × n), A_P (m × m), and B (n × m), the adjacency matrix of the heterogeneous network is
\[ A = \begin{bmatrix} A_G & B \\ B^{T} & A_P \end{bmatrix} \]

Methods
Katz, CATAPULT
CIPHER, GeneWalker, Prince, RWRH

Katz has been applied successfully to link prediction in social networks.
CATAPULT is a supervised learning method. Its features are derived from hybrid walks through the heterogeneous network.

Katz
Example graph with nodes g1 to g6 and adjacency matrix
\[ A = \begin{bmatrix}
0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 0
\end{bmatrix} \]
Entry (i, j) of A^l counts the walks of length l between nodes i and j:
\[ A^{2} = \begin{bmatrix}
2 & 1 & 1 & 1 & 1 & 1 \\
1 & 3 & 2 & 2 & 1 & 2 \\
1 & 2 & 5 & 2 & 1 & 2 \\
1 & 2 & 2 & 3 & 1 & 2 \\
1 & 1 & 1 & 1 & 2 & 1 \\
1 & 2 & 2 & 2 & 1 & 3
\end{bmatrix}, \quad A^{3}, \; A^{4}, \; A^{5}, \; \ldots \]

Katz: How to get the similarity matrix?
\[ S_{ij} = \sum_{l=1}^{k} \beta_l \, (A^{l})_{ij} \]
Katz measure (taking \beta_l = \beta^{l} and letting the sum run over all walk lengths):
\[ S^{katz} = \sum_{l \ge 1} \beta^{l} A^{l} = (I - \beta A)^{-1} - I, \qquad \beta < \frac{1}{\lVert A \rVert_2} \]
Small values of k (k = 3 or k = 4) are known to yield competitive performance in the task of recommending similar nodes.
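The truncated Katz score on the six-node example graph can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the matrix is the g1 to g6 adjacency matrix from the slide, while the damping value `beta = 0.1`, the choice `k = 3`, and the helper names `matmul` and `truncated_katz` are hypothetical and not taken from the talk.

```python
# Adjacency matrix of the six-gene example graph (rows/columns g1..g6).
A = [
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]


def matmul(X, Y):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def truncated_katz(A, beta, k):
    """Return S = sum_{l=1..k} beta**l * A**l (hypothetical helper)."""
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in A]  # holds A**l, starting with l = 1
    for l in range(1, k + 1):
        for i in range(n):
            for j in range(n):
                S[i][j] += beta ** l * power[i][j]
        power = matmul(power, A)
    return S


A2 = matmul(A, A)
print(A2[0])  # walks of length 2 starting at g1: [2, 1, 1, 1, 1, 1]

# beta = 0.1 is an assumed damping value, chosen only for illustration.
S = truncated_katz(A, beta=0.1, k=3)
print(S[0][1])  # g1 and g2 share no edge, yet their Katz score is positive
```

Although g1 and g2 are not directly connected, they share the neighbor g3 and are joined by three walks of length 3, so their score is nonzero. This is the property that makes the measure usable for predicting missing links.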
Katz on the heterogeneous network
Adjacency matrix of the heterogeneous network:
\[ A = \begin{bmatrix} A_G & B \\ B^{T} & A_P \end{bmatrix}, \qquad
B = \begin{bmatrix} B_{Hs} & B_{S} \end{bmatrix}, \qquad
A_P = \begin{bmatrix} A_{PHs} & 0 \\ 0 & A_{PS} \end{bmatrix} \]
A_G: the gene-gene network
B: the bipartite network between genes and phenotypes
A_PHs: the similarity matrix of human diseases
A_PS: the similarity matrix of phenotypes of other species

The Katz similarity measure specialized to A:
\[ S^{Katz}(A)_{ij} = \sum_{l=1}^{k} \beta^{l} \, (A^{l})_{ij} \]
For k = 3, the similarities between gene nodes and human disease nodes are given by
\[ S^{Katz}_{Hs}(A) = \beta B_{Hs} + \beta^{2} \left( A_G B_{Hs} + B_{Hs} A_{PHs} \right) + \beta^{3} \left( B B^{T} B_{Hs} + A_G^{2} B_{Hs} + A_G B_{Hs} A_{PHs} + B_{Hs} A_{PHs}^{2} \right) \]

CATAPULT: How to train a biased SVM?
T: the number of bootstraps
A: the set of positive examples
U: the set of unlabeled gene-phenotype pairs
n+: the number of examples in A

Step 1: Draw a bootstrap sample U_t ⊆ U of size n+.
Step 2: Train a linear classifier θ using the positive training examples A, with U_t as negative examples.
Step 3: For any x ∈ U \ U_t, update:
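Returning to the Katz measure on the heterogeneous network: the k = 3 expansion for the gene-to-human-disease block can be checked numerically against a direct computation of βA + β²A² + β³A³. The sketch below is not part of the talk. It uses plain Python, and every block (3 genes, 2 human diseases, 1 phenotype from another species) as well as `beta = 0.1` is a made-up toy value.

```python
def matmul(X, Y):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(M):
    return [list(col) for col in zip(*M)]


def add(*Ms):
    """Element-wise sum of any number of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]


def scale(c, M):
    return [[c * v for v in row] for row in M]


def hstack(X, Y):
    return [rx + ry for rx, ry in zip(X, Y)]


# Toy blocks (all values hypothetical).
AG = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # gene-gene network
BHs = [[1, 0], [0, 0], [0, 1]]           # genes x human diseases
BS = [[0], [1], [0]]                     # genes x other-species phenotypes
APHs = [[0, 0.5], [0.5, 0]]              # human disease similarity
APS = [[0]]                              # other-species phenotype similarity
beta = 0.1                               # assumed damping value

# Assemble B, A_P and the full adjacency matrix A.
B = hstack(BHs, BS)
AP = hstack(APHs, [[0], [0]]) + hstack([[0, 0]], APS)
A = hstack(AG, B) + hstack(transpose(B), AP)

# Direct computation: gene rows, human-disease columns of the truncated sum.
A2 = matmul(A, A)
A3 = matmul(A2, A)
S = add(scale(beta, A), scale(beta ** 2, A2), scale(beta ** 3, A3))
n, h = len(AG), len(APHs)
direct = [row[n:n + h] for row in S[:n]]

# Block formula for k = 3.
formula = add(
    scale(beta, BHs),
    scale(beta ** 2, add(matmul(AG, BHs), matmul(BHs, APHs))),
    scale(beta ** 3, add(
        matmul(matmul(B, transpose(B)), BHs),
        matmul(matmul(AG, AG), BHs),
        matmul(matmul(AG, BHs), APHs),
        matmul(matmul(BHs, APHs), APHs),
    )),
)

max_diff = max(abs(d - f)
               for rd, rf in zip(direct, formula)
               for d, f in zip(rd, rf))
print(max_diff < 1e-12)
```

The block form only multiplies the sub-matrices that contribute to gene-to-human-disease scores, so the full matrix A never has to be raised to a power.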