Slide 1
Opportunities in Bioinformatics for Computer Science
Lenwood S. Heath
Virginia Tech
Blacksburg, VA 24061
[email protected]
University of Iowa
November 16, 2001

Slide 2: Overview
• The New Biology
• Existing bioinformatics tools
• Bioinformatics challenges
• Bioinformatics at Virginia Tech

Slide 3: Some Molecular Biology
• The instruction set for a cell is contained in its chromosomes.
• Each chromosome is a long molecule of DNA.
• Each DNA molecule contains 100s or 1000s of genes.
• Each gene encodes a protein.
• A gene is transcribed to mRNA in the nucleus.
• An mRNA is translated to a protein by a ribosome.

Slide 4: Transcription and Translation
[Diagram: DNA → (transcription) → mRNA → (translation) → protein]

Slide 5: Elaborating Cellular Function
[Diagram: DNA → (transcription) → mRNA → (translation, genetic code) → protein, annotated with regulation, degradation, and reverse transcription]
Functions:
• Structure
• Catalyze chemical reactions
• Respond to environment

Slide 6: Chromosomes
• Long molecules of DNA: 10^4 to 10^8 base pairs
• 23 matched pairs in humans
• A gene is a subsequence of a chromosome that encodes a protein.
• Proteins are associated with cell function, structure, and regulation.
• Only a fraction of the genes are in use at any time.
• Every gene is present in every cell.

Slide 7: DNA Strand
A = adenine complements T = thymine
C = cytosine complements G = guanine

Slide 8: Complementary DNA Strands
[Figure: double-stranded DNA]

Slide 9: RNA Strand
U = uracil replaces T = thymine

Slide 10: Amino Acids
• A protein is a large molecule that is a chain of amino acids (100 to 5000).
• There are 20 common amino acids (Alanine, Cysteine, …, Tyrosine).
• Three bases (a codon) suffice to encode an amino acid.
• There are also START and STOP codons.
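The base-pairing, transcription, and codon rules from the slides above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names are hypothetical, the input is assumed to be the coding strand written 5' to 3' with only the letters A, C, G, T, and the table is the standard genetic code packed in the conventional T, C, A, G order.

```python
# Standard genetic code, packed in T/C/A/G order; "*" marks a STOP codon.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Keys are RNA codons (U instead of T), values are one-letter amino acid codes.
CODON_TABLE = {
    (a + b + c).replace("T", "U"): aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}

def complement(dna: str) -> str:
    """Pair each base with its complement: A<->T, C<->G."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))

def transcribe(coding_strand: str) -> str:
    """Assumption: input is the coding strand, so mRNA is the same text with U for T."""
    return coding_strand.replace("T", "U")

def translate(mrna: str) -> str:
    """Read codons from the first START (AUG) until a STOP codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # STOP codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGGCCTGTTACTAA"       # made-up example sequence
mrna = transcribe(dna)
print(complement(dna))        # TACCGGACAATGATT
print(mrna)                   # AUGGCCUGUUACUAA
print(translate(mrna))        # MACY
```

Here M is the amino acid encoded by the START codon (methionine), and A, C, and Y are alanine, cysteine, and tyrosine. A real pipeline would also handle the reverse strand, ambiguous bases, and alternative genetic codes, which this sketch ignores.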

Slide 11: Genetic Code

Slide 12: Translation to a Protein
Unlike DNA, proteins have three-dimensional structure.
Protein folds to a three-dimensional shape that minimizes energy.

Slide 13: Cell’s Fetch-Execute Cycle
• Stored Program: DNA, chromosomes, genes
• Fetch/Decode: RNA, ribosomes
• Execute Functions: Proteins (oxygen transport, cell structures, enzymes)
• Inputs: Nutrients, environmental signals, external proteins
• Outputs: Waste, response proteins, enzymes

Slide 14: The Language of the New Biology
A new language has been created. Words in the language that are useful for today’s talk:
• Genomics
• Functional Genomics
• Proteomics
• cDNA microarrays
• Global Gene Expression Patterns

Slide 15: Genomics
• Discovery of genetic sequences and the ordering of those sequences into
  • individual genes;
  • gene families;
  • chromosomes.
• Identification of
  • sequences that code for gene products/proteins;
  • sequences that act as regulatory elements.

Slide 16: Genome Sequencing Projects
• Drosophila
• Yeast
• Mouse
• Rat
• Arabidopsis
• Human
• Microbes
• …

Slide 17: Drosophila Genome

Slide 18: Functional Genomics
• The biological role of individual genes;
• mechanisms underlying the regulation of their expression;
• regulatory interactions among them.

Slide 19: Glycolysis, Citric Acid Cycle, and Related Metabolic Processes

Slide 20: Gene Expression
• Only certain genes are “turned on” at any particular time.
• When a gene is transcribed (copied to mRNA), it is said to be expressed.
• The mRNA in a cell can be isolated. Its contents give a snapshot of the genes currently being expressed.
• Correlating gene expressions with conditions gives hints into the dynamic functioning of the cell.
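One simple way to "correlate gene expression with conditions" is to compute a Pearson correlation between each gene's measured expression level and a numeric condition across samples. The sketch below is only an illustration under stated assumptions: the gene names, expression values, and stress levels are invented, and the talk does not say which statistic the speaker's group actually uses.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: one condition value per sample (e.g. a stress level)
# and one expression measurement per gene per sample.
stress_level = [0, 1, 2, 3]
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],  # rises with the condition
    "geneB": [4.0, 3.0, 2.0, 1.0],  # falls with the condition
    "geneC": [2.0, 1.0, 2.0, 1.0],  # no clear trend
}

correlations = {gene: pearson(stress_level, levels)
                for gene, levels in expression.items()}

# List genes from most strongly to least strongly associated with the condition.
for gene, r in sorted(correlations.items(), key=lambda item: -abs(item[1])):
    print(f"{gene}: r = {r:+.2f}")
```

Genes whose correlation is close to +1 or -1 are candidates for being switched on or off by the condition. With only a handful of samples such a ranking is a hint rather than evidence, which is why the statistical analysis of real experiments needs replication and proper hypothesis tests.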

Slide 21: Gene Expression: Control Points

Slide 22: Free Radicals

Slide 23: Responses to Environmental Signals

Slide 24: Effects of Drought Stress
Virginia Tech:
• Plant Biologists: Ruth Alscher, Boris Chevone
• CS: Lenny Heath, Naren Ramakrishnan, and colleagues
• Statistics: Ina Hoeschele, Shun-Hwa Li
NC State (Forest Biotechnology): Ying-Hsuan Sun, Ron Sederoff, Ross Whetten

Slide 25: Intracellular Decision Making

Slide 26: Relative Abundance Detection
[Diagram: treatment and control samples are mixed and hybridized to spots 1, 2, and 3 (sequences affixed to slide), followed by detection]

Slide 27: Gene Expression Varies

Slide 28: Existing Computational Tools in Bioinformatics
• Sequence similarity
• Multiple sequence alignments
• Database searching
• Evolutionary (phylogenetic) tree construction
• Sequence assemblers
• Gene finders

Slide 29: Challenges for Bioinformatics
• Analyzing and synthesizing complex experimental data
• Representing and accessing vast quantities of information
• Pattern matching
• Data mining
• Gene discovery
• Function discovery
• Modeling the dynamics of cell function

Slide 30: Bioinformatics at Virginia Tech
Computer science interacts with the life sciences.
• Computer Science in Bioinformatics:
• Joint research with: plant biologists, microbial biologists, biochemists, cell-cycle biologists, animal scientists, crop scientists, statisticians.
• Projects: Expresso; Nupotato; MURI; Arabidopsis Genome; Barista; Cell-Cycle Modeling
• Graduate option in bioinformatics
• Virginia Bioinformatics Institute (VBI)

Slide 31: Expresso: A Problem Solving Environment (PSE) for Microarray Experiment Design and Analysis
• Integration of design and procedures
• Integration of image analysis tools and statistical analysis
• Data mining using inductive logic programming (ILP)
• Closing the loop
• Integrating models

Slide 32: Flow of a Microarray Experiment
[Flowchart with the steps: PCR; Select cDNAs; Replication and Randomization; Robotic Printing; Hypotheses; Identify Spots; Intensities; Statistics; Hybridization; Test of Hypotheses; Extract RNA; Clustering; Reverse Transcription and Fluorescent Labeling; Data Mining, ILP]

Slide 33: Expresso: A Microarray Experiment Management System

Slide 34: Nupotato
• Potatoes originated in the Andes, where there are many varieties.
• Many varieties survive at high altitude in cold, dry conditions.
• Microarray technology can be used to investigate genes that are responsible for stress resistance and that are responsible for the production of nutrients.

Slide 35: MURI
• Some microorganisms have the ability to survive drying out or intense radiation.
• Their genomes are just being sequenced.
• Using microarrays and proteomics, we will try to correlate computationally the genes in the genomes with the special traits of the microorganisms.
• We are currently using multiple genome analysis.

Slide 36: Arabidopsis Genome Project
• Arabidopsis is a model higher plant.
• It is the first higher plant whose genome has been fully sequenced.
• Gene finder software has been used to identify putative genes.
• We are computationally mining the regulatory regions of these genes for promoter patterns.

Slide 37: Barista
• Barista serves Expresso!
• Software development team across projects to minimize duplication of effort.
• Work with Linux, Perl, C, Python, CVS, Apache, PHP, …

Slide 38: Virginia Bioinformatics Institute (VBI)
• Research institute based at Virginia Tech
• Established July 1, 2000, with $3 million
• Will occupy 2 buildings and have 100+ employees in 4 years

Slide 39: Getting Into Bioinformatics
• Learn some biology: genetics, cell biology
• Study computational (molecular) biology
• Get involved with bioinformatics research in interdisciplinary teams
• Work with biologists to solve their problems