Slide 1: Some Biology That Computer Scientists Need for Bioinformatics
Lenwood S. Heath
Virginia Tech, Blacksburg, VA 24061
[email protected]
University of Maryland
December 14, 2001

Slide 2: Overview
I. Some Molecular Biology and Genomics
II. Language of the New Biology
III. Existing bioinformatics tools
IV. Bioinformatics challenges
V. Bioinformatics at Virginia Tech

Slide 3: I. Some Molecular Biology
• The instruction set for a cell is contained in its chromosomes.
• Each chromosome is a long DNA molecule.
• Each DNA molecule contains hundreds or thousands of genes.
• Each gene encodes a protein.
• A gene is transcribed to mRNA in the nucleus.
• An mRNA is translated to a protein on ribosomes.

Slide 4: Transcription and Translation
DNA -> mRNA (Transcription)
mRNA -> Protein (Translation)

Slide 5: Elaborating Cellular Function
Diagram labels: Regulation; Degradation; Transcription (DNA to mRNA); Translation (mRNA to Protein, via the Genetic Code); Reverse Transcription (mRNA to DNA); Thousands of Genes!
Functions:
• Structure
• Catalyze chemical reactions
• Respond to environment

Slide 6: Chromosomes
• Long molecules of DNA: 10^4 to 10^8 base pairs
• 23 matched pairs in humans
• A gene is a subsequence of a chromosome that encodes a protein.
• Proteins are associated with cell function, structure, and regulation.
• Only a fraction of the genes are in use at any time.
• Every gene is present in every cell.

Slide 7: DNA Strand
Sugar: 2’-deoxyribose
5’ End  C A C T T T A G A G C G  3’ End
Bases:
• A (adenine) complements T (thymine)
• C (cytosine) complements G (guanine)

Slide 8: Complementary DNA Strands
C A C T T T A G A G C G
G T G A A A T C T C G C
Paired together, the two strands form double-stranded DNA.

Slide 9: RNA Strand
Sugar: ribose
5’ End  C A C U U U A G A G C G  3’ End
Bases: U (uracil) replaces T (thymine)

Slide 10: Transcription of DNA to mRNA
Coding DNA Strand:    C A C T T T A G A G C G
Template DNA Strand:  G T G A A A T C T C G C
mRNA Strand:          C A C U U U A G A G C G
Template DNA Strand:  G T G A A A T C T C G C

Slide 11: Proteins and Amino Acids
• A protein is a large molecule that is a chain of amino acids (100 to 5000).
• There are 20 common amino acids (Alanine, Cysteine, …, Tyrosine).
• Three bases (a codon) suffice to encode an amino acid.
• There are also START and STOP codons.

Slide 12: Genetic Code

Slide 13: Translation to a Protein
mRNA Strand: C A C U U U A G A G C G
• C A C: Histidine
• U U U: Phenylalanine
• A G A: Arginine
• G C G: Alanine
Nascent polypeptide: amino acids bound together by peptide bonds.
Unlike DNA, proteins have a three-dimensional structure that is essential to protein function. A protein folds to a three-dimensional shape that cannot yet be predicted from the primary sequence.
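The complement, transcription, and translation steps shown on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original talk: the function names and the `CODON_TABLE` dictionary are hypothetical, and the table is deliberately partial, holding only the four codons that appear on the slides rather than the full genetic code.

```python
# Sketch of the slide examples: complement, transcription, translation.
# All names here are illustrative assumptions, not from the talk.

# Base pairing from the DNA Strand slide: A<->T, C<->G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Partial codon table: only the four codons shown on the slides.
CODON_TABLE = {
    "CAC": "Histidine",
    "UUU": "Phenylalanine",
    "AGA": "Arginine",
    "GCG": "Alanine",
}


def complement(dna):
    """Return the base-by-base complement, in the same left-to-right
    order in which the slides draw the paired strand."""
    return "".join(DNA_COMPLEMENT[base] for base in dna)


def transcribe(coding_strand):
    """The mRNA carries the coding strand's sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time and look up each codon.
    Trailing bases that do not fill a codon are ignored."""
    usable = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE[codon] for codon in codons]


coding = "CACTTTAGAGCG"          # coding DNA strand from the slides
template = complement(coding)    # template DNA strand
mrna = transcribe(coding)        # mRNA strand
protein = translate(mrna)        # amino acids in order

print(template)
print(mrna)
print(protein)
```

A codon missing from the partial table raises a `KeyError`, which is intentional here: a real translator would need all 64 codons, including START and STOP handling.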
Slide 14: Transcription and Translation
DNA -> mRNA (Transcription)
mRNA -> Protein (Translation)

Slide 15: Transcription of DNA to mRNA
Coding DNA Strand:    C A C T T T A G A G C G
Template DNA Strand:  G T G A A A T C T C G C
mRNA Strand:          C A C U U U A G A G C G
Template DNA Strand:  G T G A A A T C T C G C

Slide 16: Translation to a Protein
mRNA Strand: C A C U U U A G A G C G
• C A C: Histidine
• U U U: Phenylalanine
• A G A: Arginine
• G C G: Alanine
Nascent polypeptide: amino acids bound together by peptide bonds.

Slide 17: Cell’s Fetch-Execute Cycle
• Stored Program: DNA, chromosomes, genes
• Fetch/Decode: RNA, ribosomes
• Execute Functions: Proteins (oxygen transport, cell structures, enzymes)
• Inputs: Nutrients, environmental signals, external proteins
• Outputs: Waste, response proteins, enzymes

Slide 18: II. The Language of the New Biology
A new language has been created. Words in the language that are useful for today’s talks:
• Genomics
• Functional Genomics
• Proteomics
• cDNA Microarrays
• Global Gene Expression Patterns

Slide 19: Genomics
• Discovery of genetic sequences and the ordering of those sequences into
  • individual genes;
  • gene families;
  • chromosomes.
• Identification of
  • sequences that code for gene products/proteins;
  • sequences that act as regulatory elements.

Slide 20: Genome Sequencing Projects
• Drosophila
• Yeast
• Mouse
• Rat
• Arabidopsis
• Human
• Microbes
• …

Slide 21: Drosophila Genome

Slide 22: Functional Genomics
• The biological role of individual genes.
• Mechanisms underlying the regulation of their expression.
• Regulatory interactions among them.

Slide 23: Glycolysis, Citric Acid Cycle, and Related Metabolic Processes

Slide 24: Gene Expression
• Only certain genes are “turned on” at any particular time.
• When a gene is transcribed (copied to mRNA), it is said to be expressed.
• The mRNA in a cell can be isolated. Its contents give a snapshot of the genes currently being expressed.
• Correlating gene expressions with conditions gives hints into the dynamic functioning of the cell.

Slide 26: Responses to Environmental Signals

Slide 27: Intracellular Decision Making

Slide 28: Microarray Technology
• In the past, gene expression and gene interactions were examined known gene by known gene, process by process.
• With microarray technology:
  – Simultaneous examination of large groups of genes and associated interactions
  – Possible discovery of new cellular mechanisms involving gene expression

Slide 29: Flow of a Microarray Experiment
Diagram labels: PCR; Select cDNAs; Replication and Randomization; Robotic Printing; Hypotheses; Identify Spots; Intensities; Statistics; Hybridization; Test of Hypotheses; Extract RNA; Clustering; Reverse Transcription and Fluorescent Labeling; Data Mining, ILP

Slide 30: Relative Abundance Detection
Diagram labels: Treatment; Control; Mix; Spots (sequences affixed to slide); Hybridization; Detection

Slide 31: Gene Expression Varies
Cy5 to Cy3 ratios

Slide 32: III. Existing Computational Tools in Bioinformatics
• Sequence similarity
• Multiple sequence alignments
• Database searching
• Evolutionary (phylogenetic) tree construction
• Sequence assemblers
• Gene finders

Slide 33: Existing Biological Databases
• Molecular Sequences: Genomic DNA, mRNA, ESTs, proteins
• Protein domains, motifs, or blocks
• Protein families
• Genomes
• Nomenclature and ontologies
• Biological literature

Slide 34: IV. Challenges for Bioinformatics
• Analyzing and synthesizing complex experimental data
• Representing and accessing vast quantities of information
• Pattern matching
• Data mining: whole genome analysis
• Gene discovery
• Function discovery
• Modeling the dynamics of cell function

Slide 35: V. Bioinformatics at Virginia Tech
Computer science interacts with the life sciences.
• Computer Science in Bioinformatics:
  • Joint research with: plant biologists, microbial biologists, biochemists, cell-cycle biologists, animal scientists, crop scientists, statisticians.
  • Projects: Expresso; Nupotato; MURI; Arabidopsis Genome; Barista; Cell-Cycle Modeling
• Graduate option in bioinformatics
• Virginia Bioinformatics Institute (VBI)

Slide 36: Expresso: A Problem Solving Environment (PSE) for Microarray Experiment Design and Analysis
• Integration of design and procedures
• Integration of image analysis tools and statistical analysis
• Data mining using inductive logic programming (ILP)
• Closing the loop
• Integrating models

Slide 42: Getting Into Bioinformatics
• Learn some biology: genetics, cell biology
• Study computational (molecular) biology
• Get involved with bioinformatics research in interdisciplinary teams
• Work with biologists to solve their problems
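
As a first programming exercise on the microarray material, the Cy5-to-Cy3 ratios named on Slide 31 can be turned into log ratios, a common way to compare two-channel spot intensities. The sketch below is an illustration under stated assumptions, not the Expresso method: the function name, the gene names, and the intensity values are all hypothetical, and the slides do not say which dye labels the treatment sample and which the control.

```python
import math


def log2_ratio(cy5, cy3):
    """Return log2(Cy5 / Cy3) for one spot.

    0 means equal signal in both channels, +1 a twofold excess of Cy5,
    and -1 a twofold excess of Cy3.
    """
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


# Hypothetical spot intensities, gene name -> (Cy5, Cy3); not data from the talk.
spots = {
    "geneA": (2000.0, 500.0),
    "geneB": (800.0, 800.0),
    "geneC": (150.0, 600.0),
}

ratios = {gene: log2_ratio(cy5, cy3) for gene, (cy5, cy3) in spots.items()}

for gene, value in ratios.items():
    print(gene, value)
```

Working in log space makes a fourfold increase and a fourfold decrease symmetric around zero, which is convenient for the clustering and statistics steps listed in the experiment flow.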