Bioinformatics
Sean Langford, Larry Hale

What is it?
Bioinformatics is an interdisciplinary scientific field focused on developing methods for storing, retrieving, organizing, and analyzing biological data, usually data of a cellular or genetic nature.

What is it? (cont.)
A major focus of bioinformatics is producing software tools that generate biological knowledge, drawing on advanced skills from computer science, mathematics, and engineering.

What it does
Bioinformatics uses a variety of techniques to develop software tools that produce valuable biological knowledge. In this respect it is similar to Biological Computation.

History
The term "bioinformatics" was coined by Paulien Hogeweg and Ben Hesper in 1970 and referred to the study of information processes in biological systems. Computers became necessary in molecular biology when protein sequences became available in the 1950s, and in genetics in 1982 as more genome sequences became available.

Bioinformatics vs. Biological Computation
The two fields have similar aims; the major difference is one of scale. Bioinformatics deals with basic biological data and pays attention to details, while Biological Computation is a subset of computer science that builds large-scale theoretical models of biological systems in order to understand those systems at an abstract level.

Goals
Bioinformatics is now focused on creating and advancing databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.

Goals (cont.)
Problems addressed so far in pursuit of these goals include producing GMOs to protect crops and developing gene therapy for a variety of genetic disorders.
Algorithms
Rather than listing the specific algorithms used, it is more appropriate to consider which algorithms and types of algorithms are not used. Bioinformatics is a broad field that uses a large number of algorithms to accomplish an extremely large number of tasks.

Analyzing Data
Analyzing data is a very important goal of bioinformatics at the moment. It uses algorithms from Artificial Intelligence, Soft Computing, Data Mining, Image Processing, and Simulation, and relies heavily on Discrete Mathematics and Statistics.

Practical Example
A prime example of an algorithm used in bioinformatics is the LZW algorithm, which compresses and decompresses genetic strings so that the information can be stored more efficiently. A demonstration of this usage was seen in Problem 4 of our homework.

Problem 4
DNA sequences use the alphabet {A, C, G, T}. Use the LZW algorithm to compress the following DNA sequence:
ATGGAAGGAACTAATGGCCACCAAAAC
GGTTCATTTTGCTTGTCCACTGCCAAGGG
AAATAATGATCCCTTGAACTGGGGAGCG
GCGGCGGAGGCA

Compression
Answer as best agreed on by presenters (the codes correspond to an initial dictionary of A = 0, C = 1, G = 2, T = 3):
0, 3, 2, 2, 0, 0, 6, 8, 1, 3, 8, 5, 2, 1, 1, 0, 17, 8, 11, 6, 3, 3, 18, 24, 24, 16, 28, 25, 18, 12, 16, 18, 9, 14, 7, 31, 12, 5, 11, 15, 10, 16, 6, 1, 48, 16, 0

Practical Example Part 2
Another common class of algorithms in bioinformatics finds the longest common subsequence of two sequences. This is also demonstrated in the homework, in Problem 1.

Problem 1
Given the following DNA sequences, S1 = (ATGGAACGAACT) and S2 = (TTGCCAGTAC), find the Longest Common Subsequence (LCS) of S1 and S2 using the table below. Show the values and arrows in each cell and circle the cells that are part of the LCS. Write the LCS.

Longest Common Subsequence
Answer best agreed on by presenters: T-G---G-AC (aligned against S2), which reads TGGAC. A longer common subsequence exists, for example TGAGAC (length 6), which appears in order in both S1 and S2.

Questions?
Dost thou haveth any?
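The two homework problems can be checked with a short Python sketch. It is a minimal illustration, not the presenters' own code. The function names (`lzw_compress`, `lzw_decompress`, `lcs`) are hypothetical, and the initial dictionary ordering A = 0, C = 1, G = 2, T = 3 is an assumption inferred from the first codes of the compression answer. The LCS function uses the standard dynamic-programming table followed by a traceback, which is the table-and-arrows procedure Problem 1 asks for.

```python
def lzw_compress(text, alphabet="ACGT"):
    """Return the list of LZW codes for `text` (symbols must be in `alphabet`)."""
    # Assumed initial dictionary: A=0, C=1, G=2, T=3.
    codes = {ch: i for i, ch in enumerate(alphabet)}
    out = []
    current = ""
    for ch in text:
        if current + ch in codes:
            # Keep extending the current phrase while it is still known.
            current += ch
        else:
            # Emit the longest known phrase and register the new one.
            out.append(codes[current])
            codes[current + ch] = len(codes)
            current = ch
    if current:
        out.append(codes[current])
    return out


def lzw_decompress(codes, alphabet="ACGT"):
    """Rebuild the original string from a list of LZW codes."""
    if not codes:
        return ""
    entries = {i: ch for i, ch in enumerate(alphabet)}
    previous = entries[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in entries:
            phrase = entries[code]
        else:
            # The code refers to the entry that is about to be created.
            phrase = previous + previous[0]
        out.append(phrase)
        entries[len(entries)] = previous + phrase[0]
        previous = phrase
    return "".join(out)


def lcs(s1, s2):
    """Return one longest common subsequence of s1 and s2."""
    rows, cols = len(s1), len(s2)
    # table[i][j] = LCS length of s1[:i] and s2[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if s1[i - 1] == s2[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back from the bottom-right cell, following the "arrows".
    i, j = rows, cols
    chars = []
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            chars.append(s1[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))


if __name__ == "__main__":
    first_line = "ATGGAAGGAACTAATGGCCACCAAAAC"
    packed = lzw_compress(first_line)
    print(packed)
    print(lzw_decompress(packed) == first_line)
    print(lcs("ATGGAACGAACT", "TTGCCAGTAC"))
```

Run on the Problem 1 inputs, `lcs` returns a common subsequence of length 6. Ties in the table are broken by moving up first, so other equally long subsequences are possible with a different tie-breaking rule.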