* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download 341- INTRODUCTION TO BIOINFORMATICS Overview of the …
Survey
Document related concepts
Transcript
341- INTRODUCTION TO BIOINFORMATICS Overview of the Course Material 1 Introduction to Biology • Biological Terms: DNA, RNA, Gene, Gnome, Codon, Protein, Proteome, Amino acids • Central Dogma: – The transcription of DNA to mRNA. – Translation of the mRNA to a protein 2 Sequencing and Sequence Alignment • Sequences, Mutations, and Evolution • Sequencing technologies: – Sanger Sequencing – Next Generation Sequencing • Sequence Alignment: – Why is it needed? – Sequence Similarity Scoring Matrices; e.g. BLOSUM – BLAST Algorithm – Accuracy, Sensitivity and Selectivity of sequence searching – Dynamic Programming based Sequence Alignment • Algorithms: Needleman-Wunsch (global alignment) , • Smith-Waterman(local alignment) 3 Functional Genomic and Microarray Analysis • What is gene expression? • What are microarrays? • What are the steps of microarray experiments? • Statistical significance tests that are used for evaluating the results of microarray experiments; e.g., t -test. • Data clustering / classification – Distance Measures – Hierarchical clustering – Decision trees – K-means/K-medoids clustering – K-nearest neighbourhood clustering 4 Introduction to Graph Theory • Basic definitions on graphs and graph types; e.g. bi -partite graphs, cycles, trees, clique, path, subgraphs, connected component, isomorphism, automorphism, automorphism orbits. • Minimum spanning trees – Kruskal’s Algorithm – Prim’s Algorithm • Graph representations: – Adjacency Matrix – Adjacency List • Running times of Algorithms; upper bounds, lower bounds. • Complexity classes of problems: P, NP, NP-Hard • Graph Traversal Algorithms: – Breadth-first search – Depth-first search • Shortest Path Algorithms – Dijsktra’s Algorithm – Floyd-Warshall algorithm 5 Introduction to Biological Networks • Different types of biological networks: – Metabolic Networks – Transcription Regulation Networks – Cell Signalling networks – Protein-protein interaction Networks – Genetic interaction Networks – Protein Structure Networks • What do they represent? • What are the relations among these network types? • What are the available sources of data for obtaining these networks? • What are the possible sources of bias in these networks? 6 Network Properties • The global network properties: – Degree / Degree Distribution – Clustering Coefficient – Diameter / Shortest Path Length Distribution – Centrality Measures: • Degree Centrality • Betweenness Centrality • Closeness Centrality • Eccentricity Centrality • The local network properties: – Network motifs – Graphlets / Graphlet Degree Distributions 7 Network Models • Different types of network models: – Erdos-Renyi Networks (ER) – Small-world Networks – Scale-free Networks (SF) – Hierarchical Networks – Geometric Networks (GEO) – Generalized Random Networks (ERDD) – Geometric Networks with Gene Duplication / Divergence Events(GEOGD) – Scale-free Networks with Gene Duplications/Divergence Events (SFGD) – Stickiness-index based Networks (STICKY) 8 Network Comparison / Alignment • Paralogy / Orthology • Network: alignment, integration, querying • Network Alignment Algorithms: – Global / Local Alignment – Pairwise Alignment / Multiple Alignment – Functional vs. topological information • Alignment quality evaluations measures: o o o o Edge Correctness, Size of Common Connected Subgraphs, attributed to chance, biological quality 9 Network Comparison / Alignment • Key algorithmic components: – Similarity between nodes • Functional • Topological – Search (identification of high-scoring alignments) • Seed-and-extend • Finding an optimal alignment 10 10 Protein 3D Structure • Four levels of protein structure • Available data • The structure function paradigm • Rigid and flexible structure comparison – Algorithms 11 Network Data Integration • Today’s lecture 12