Review of Course Topics
(Lecture for CS498-CXZ Algorithms in Bioinformatics)
Dec. 8, 2005
ChengXiang Zhai
Department of Computer Science
University of Illinois, Urbana-Champaign

Key Algorithms
• DNA sequencing
  – Shortest superstring problem & Eulerian graph approach
  – Lander-Waterman model
  – Overlap-Layout-Consensus
• Gene identification
  – Exon chaining (similarity)
  – Likelihood ratio (statistical)
• Pairwise alignment
  – Dynamic programming
  – Scoring (scoring matrix, affine gap)
  – Variations (local alignment, Smith-Waterman algorithm)
• Multiple sequence alignment
  – Exact: multidimensional dynamic programming
  – Inexact: Feng-Doolittle progressive alignment
• Hidden Markov models
  – Finding the most likely path: Viterbi
  – Computing sequence probabilities: Forward/Backward
  – Supervised learning
  – Profile HMM
• Microarray data analysis
  – Agglomerative hierarchical clustering: single-link, complete-link, average/group-link
  – K-means clustering
• Phylogenetic tree construction
  – Neighbor-joining
  – Maximum parsimony
• Regulatory motifs
  – Deterministic: Consensus
  – Sampling: Gibbs sampler
• Genome rearrangements
  – Sorting by reversals (breakpoint elimination)

Typical Steps to Solve a Bioinformatics Problem
• Problem formulation
  – Understand the original biology problem
  – Formalize the problem as a computational problem
  – Must make assumptions (many are unrealistic)
• Find algorithms to solve the problem
  – Brute force is often too slow or consumes too much memory
  – Developing efficient algorithms is the main challenge
  – When it’s impossible to find an exact solution quickly, think about finding an approximate solution
• Evaluate the algorithms and
  – Further improve the algorithms
  – Further improve the problem formulation

What To Do Next?
• Research Track:
  – For undergraduate students: consider graduate school (many now offer Ph.D./M.S. programs in bioinformatics)
  – For graduate students: find a research advisor in this direction (UIUC is hiring more faculty in bioinformatics)
• Industry Track:
  – The pharmaceutical industry is the main job market
• Further Training:
  – Molecular biology
  – Advanced/specialized bioinformatics courses
  – Machine learning
  – Data mining (relational and textual)
  – Statistics
  – Databases/Web search

Course Evaluation

Thank You!
Good luck on the final!
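
As a refresher on the pairwise alignment item under Key Algorithms, here is a minimal sketch of the Smith-Waterman local alignment recurrence. It is not taken from the course materials: the function name `smith_waterman_score` and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration. It uses a linear gap penalty, not the affine gap scheme also listed, and returns only the best score with no traceback.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scoring values are illustrative assumptions, not course-specified.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1].
    # Row 0 and column 0 stay at 0, so an alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

The table takes O(mn) time and space for sequences of lengths m and n. Dropping the clamp at 0 and initializing the first row and column with cumulative gap penalties would turn this into a global (Needleman-Wunsch style) alignment.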