Application of Unstructured Learning in Computational Biology
Tony C Smith
Department of Computer Science
University of Waikato
[email protected]

Computability
Before computers were built, mathematicians knew what they could do
• arithmetic (e.g. missile trajectories)
• search (e.g. keys for secret codes)
• sort (e.g. census information)
• … anything with a mathematical algorithm

Unstructured learning in computational biology
Tony C Smith

Artificial Intelligence
Computers do things only human brains can otherwise do
• expert
• expert system
• learning system

Machine learning
What is machine learning?
• creating computer programs that get better with experience
• learn how to make expert judgments
• discover previously hidden, potentially useful information (data mining)
How does it work?
• user provides learning system with examples of concept to be learned
• induction algorithm infers a characteristic model of the examples
• model is used to predict whether or not future novel instances are also examples
• it does this very consistently, and very, very quickly!

Structured learning
Mushroom Data

Weight    Damage    Dirt     Firmness    Quality
heavy     high      mild     hard        poor
heavy     high      mild     soft        poor
normal    high      mild     hard        good
light     medium    mild     hard        good
light     clear     clean    hard        good
normal    clear     clean    soft        poor
heavy     medium    mild     hard        poor
...

[Figure: decision tree for the mushroom data, with internal nodes weight, dirt and firmness, and leaves good or poor]

Unstructured learning
• data does not have fixed fields with specific values
• examples: images, continuous signals, expression data, text
• learning proceeds by correlating the presence or absence of any and all salient attributes

Document Classification
• given examples of documents covering some topic, learn a semantic model that can recognize whether or not other documents are relevant
• prioritize them: i.e. quantify "how relevant" documents are to the topic
• not limited to keywords (nor is it misled by them)
• adapt to the user's needs (ephemeral or long-term)

Document classification demo

Bioinformatics
• Finding genes
• Determining gene roles
• Determining protein functions
  • Empirical tests
  • Sequence similarity comparison
  • Literature

GO-KDS demo

Amino Acid
[Figure: amino acid structure, labelled R group, Amino group, Carboxyl group]

Amino Acid
[Figure: example amino acids, tyrosine and glycine]

DNA encodes amino acids

Rasmol demo

Biotechnology
• Biologists know proteins, computer scientists know machine learning
• Together, they can find out a lot of hidden information about genes and proteins
• Biotechnology is a multi-billion dollar industry
• Biotechnology is one of the best funded areas of scientific research
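
Worked note for the "Machine learning" and "Structured learning" slides. The slides do not name the induction algorithm, so the sketch below assumes an ID3-style decision tree (split on the attribute that leaves the least label entropy). It uses only the seven rows visible on the Mushroom Data slide. The function names and the tree representation are illustrative, not taken from the talk.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["weight", "damage", "dirt", "firmness"]

# The seven rows visible on the "Mushroom Data" slide; the last field is quality.
ROWS = [
    ("heavy", "high", "mild", "hard", "poor"),
    ("heavy", "high", "mild", "soft", "poor"),
    ("normal", "high", "mild", "hard", "good"),
    ("light", "medium", "mild", "hard", "good"),
    ("light", "clear", "clean", "hard", "good"),
    ("normal", "clear", "clean", "soft", "poor"),
    ("heavy", "medium", "mild", "hard", "poor"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split(rows, index):
    """Group rows by the value they hold in column `index`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[index], []).append(row)
    return groups


def induce(rows, attribute_indexes):
    """Infer a tree: a label string (leaf) or a dict describing a split."""
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attribute_indexes:
        return majority

    def remaining_entropy(index):
        groups = split(rows, index).values()
        return sum(len(g) / len(rows) * entropy([r[-1] for r in g]) for g in groups)

    best = min(attribute_indexes, key=remaining_entropy)
    rest = [i for i in attribute_indexes if i != best]
    branches = {v: induce(g, rest) for v, g in split(rows, best).items()}
    return {"attribute": best, "default": majority, "branches": branches}


def predict(tree, instance):
    """Walk the tree; fall back to the majority label for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(instance[tree["attribute"]], tree["default"])
    return tree


tree = induce(ROWS, list(range(len(ATTRIBUTES))))
print(ATTRIBUTES[tree["attribute"]])                      # attribute chosen at the root
print(predict(tree, ("heavy", "clear", "clean", "soft")))  # a novel instance
```

On these seven rows the root split falls on weight, which agrees with the root of the tree on the slide. Lower levels may differ from the slide because several attributes separate the remaining rows equally well.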
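
Worked note for the "Document Classification" slide. The slide describes learning from example documents and quantifying "how relevant" a new document is. One common way to do that, assumed here and not confirmed by the talk, is a naive Bayes model over bags of words with add-one smoothing. The training sentences, the labels "relevant" and "other", and all function names are hypothetical.

```python
from collections import Counter
from math import log


def tokenize(text):
    """Lower-case whitespace tokenizer; real systems would do more."""
    return text.lower().split()


def train(examples):
    """examples: list of (text, label). Returns word counts, doc counts, vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def log_score(model, text, label):
    """Log prior plus smoothed log likelihood of the text under one label."""
    word_counts, doc_counts, vocabulary = model
    counts = word_counts[label]
    total = sum(counts.values())
    score = log(doc_counts[label] / sum(doc_counts.values()))
    for word in tokenize(text):
        if word in vocabulary:  # words never seen in training are ignored
            score += log((counts[word] + 1) / (total + len(vocabulary)))
    return score


def relevance(model, text):
    """Log-odds of 'relevant' versus 'other'; larger means more relevant."""
    return log_score(model, text, "relevant") - log_score(model, text, "other")


# Hypothetical training examples, invented for illustration only.
EXAMPLES = [
    ("protein kinase binds membrane receptor", "relevant"),
    ("gene expression regulates protein folding", "relevant"),
    ("football team wins league match", "other"),
    ("stock market prices fall sharply", "other"),
]

model = train(EXAMPLES)
documents = ["league match prices", "receptor protein expression", "protein market"]
ranked = sorted(documents, key=lambda d: relevance(model, d), reverse=True)
print(ranked)
```

Sorting by the log-odds score gives the prioritized list the slide mentions. Because every word contributes evidence, a single keyword such as "market" does not by itself decide the outcome.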
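
Worked note for the "DNA encodes amino acids" slide. The slide's figure did not survive extraction, so this is a minimal sketch of the idea using the standard genetic code: each three-base codon maps to one amino acid, and a stop codon ends translation. The example sequence and the function name are made up for illustration. The sequence was chosen to include codons for tyrosine (Y) and glycine (G), the two amino acids named on the earlier slide.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, reading from its first base,
    and stop at the first stop codon or at the end of the sequence."""
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence: ATG (methionine), TAT (tyrosine), GGT (glycine), TAA (stop).
print(translate("ATGTATGGTTAA"))
```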