Molecular Biology Background
Debasis Mitra, Florida Tech
Credit: Pevzner text-site
5/4/2017

Section 1: What is Life made of?

2 types of cells: Prokaryotes vs. Eukaryotes

Life begins with Cell
• A cell is the smallest structural unit of an organism that is capable of independent functioning.
• All cells have some common features.

Prokaryotes and Eukaryotes
• According to the most recent evidence, there are three main branches to the tree of life.
• Prokaryotes include Archaea (“ancient ones”) and bacteria.
• Eukaryotes form the domain Eukarya, which includes plants, animals, fungi and certain algae.

Prokaryotes and Eukaryotes, continued
Prokaryotes                                   Eukaryotes
Single cell                                   Single or multi cell
No nucleus                                    Nucleus
No organelles                                 Organelles
One piece of circular DNA                     Chromosomes
No mRNA post-transcriptional modification     Exons/introns splicing

Overview of organizations of life
• Nucleus = library
• Chromosomes = bookshelves
• Genes = books
• Almost every cell in an organism contains the same libraries and the same sets of books.
• Books represent all the information (DNA) that every cell in the body needs so it can grow and carry out its various functions.

Chromosomes
Organism                              Number of base pairs    Number of chromosomes
-------------------------------------------------------------------------------------
Prokaryotic
Escherichia coli (bacterium)          4x10^6                  1
Eukaryotic
Saccharomyces cerevisiae (yeast)      1.35x10^7               17
Drosophila melanogaster (insect)      1.65x10^8               4
Homo sapiens (human)                  2.9x10^9                23
Zea mays (corn)                       5.0x10^9                10

Bio-molecules
• Nucleic acids (DNA, RNA): Library of life
• Proteins: Workhorse of life
• Fatty acids, carbohydrates, and other supporting molecules

DNA
• DNA has a double helix structure. Each unit of a strand is composed of a sugar molecule, a phosphate group, and a base (A, C, G, T).
• By convention a DNA strand is written and read from its 5’ end to its 3’ end; new strands are also synthesized in that direction during transcription and replication.
5’ ATTTAGGCC 3’
3’ TAAATCCGG 5’

DNA, RNA, and the Flow of Information
• Replication
• Transcription
• Translation

Proteins: Functions
• Structural
• Enzymes
• Information exchange (e.g., across cell walls)
• Transporting other molecules (e.g., oxygen to cells)
• Activating-deactivating genes
• Etc.

Proteins
• Built from amino acids
• A protein is a chain of “residues”
• 20 to 5000 residues long, typically a few hundred

Protein structure
• Important for its function
• Primary structure: sequence
• Secondary structure: a few topological features
• Tertiary structure: 3D folding
• Quaternary structure: protein complex

Protein Folding
• Proteins tend to fold into the lowest free energy conformation.
• Proteins begin to fold while the peptide is still being translated.
• Proteins bury most of their hydrophobic residues in an interior core.
• Most proteins contain the secondary structures α helices and β sheets.
• Molecular chaperones, hsp60 and hsp70, work with other proteins to help fold newly synthesized proteins.
• Much of the protein modification and folding occurs in the endoplasmic reticulum and mitochondria.

Protein Folding (cont’d)
• The structure that a protein adopts is vital to its chemistry.
• Its structure determines which of its amino acids are exposed to carry out the protein’s function.
• Its structure also determines what substrates it can react with.

Nucleic acids
Two types:
• DNA: Deoxyribonucleic acid
• RNA: Ribonucleic acid

Nucleic acids
• A chain of sugar molecules forms the backbone of the polymer
• Two types of sugar: ribose (RNA), 2’-deoxyribose (DNA)

Nucleic acids: DNA
• 4 types of bases connected to sugar molecules: Adenine (a), Guanine (g), Thymine (t) and Cytosine (c)
• A and T bond with each other, and so do G and C

An Introduction to Bioinformatics Algorithms, www.bioalgorithms.info
[Figure: The Purines and The Pyrimidines]

Nucleic acids: DNA
• Double stranded: two strands of sugar-molecule chains
• Each strand is directed: 5’ to 3’
• The strands are attached on the inside by base pairings (a-t and g-c)

[Figure: Double helix of DNA]

Discovery of DNA
DNA Sequences
• Chargaff and Vischer, 1949: DNA consists of A, T, G, C (Adenine, Guanine, Cytosine, Thymine)
• Chargaff Rule: noticing #A ≈ #T and #G ≈ #C. A “strange but possibly meaningless” phenomenon.
Wow!! A Double Helix
• Watson and Crick, Nature, April 25, 1953
• 1 Biologist, 1 Physics Ph.D. Student
• 900 words
• Nobel Prize
[Photos: Crick, Watson]
Rich, 1973
• Structural biologist at MIT
• DNA’s structure in atomic resolution

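Chargaff's rule falls out of the base pairing described on the earlier slides: every a on one strand faces a t on the other, and every g faces a c. The following is a minimal Python sketch of that counting argument, not part of the original slides. The function names are made up for illustration, and the input is assumed to contain only the letters a, c, g, t.

```python
from collections import Counter

# Watson-Crick pairing: a-t and g-c.
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}

def reverse_complement(strand):
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.lower()))

def duplex_base_counts(strand):
    """Count bases over both strands of the double helix built on `strand`."""
    return Counter(strand.lower()) + Counter(reverse_complement(strand))

# The example strand from the DNA slide: 5' ATTTAGGCC 3'
counts = duplex_base_counts("ATTTAGGCC")
print(counts["a"], counts["t"])  # 5 5
print(counts["g"], counts["c"])  # 4 4
```

On a single strand the counts generally differ (here 2 a versus 3 t); the equality only appears once both strands are counted, which is what a measurement on double-stranded DNA sees.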
Nucleic acids: DNA
• Each strand is the reverse complement of the other
• If s = agacgt, then reverse(s) = tgcaga and reverse-complement(s) = acgtct
• Double-strand:
  5’ -- agacgt -> 3’
  3’ <- tctgca -- 5’

Nucleic acids: DNA
• 3D structure is helical
• Double-stranded helix: like a step ladder
• Each unit is a base pair (sugar-base-base-sugar)
• DNA in cells is packaged as chromosomes (the human genome is ~3*(10^9) bp long)
• The squeezed 3D structure in the cell may have functional importance; not well studied

DNA Replication

Nucleic acids: RNA
• Replaces t with u (uracil) as a base
• May or may not be double stranded (mostly not)
• Functions: information storage like DNA, sometimes workhorse like proteins
• Possible evolutionary precursor to DNA and protein

Genetic code
• Proteins do almost all the work!!
• Information for coding proteins is stored in DNA (or RNA): genes
• Three consecutive bases on a gene code for an amino acid or for the STOP signal: a codon
• The table of codon assignments is called the genetic code

Cell Information: Instruction book of Life
• DNA, RNA, and proteins are examples of strings written in either the four-letter nucleotide alphabet of DNA and RNA (A C G T/U) or the twenty-letter amino acid alphabet of proteins.
• Each amino acid (Leu, Arg, Met, etc.) is coded by 3 nucleotides, called a codon.

Overview of DNA to RNA to Protein
A gene is expressed in two steps:
1) Transcription: RNA synthesis
2) Translation: Protein synthesis

Central Dogma of Biology
• The information for making proteins is stored in DNA.
• There is a process (transcription and translation) by which the information in DNA is used to make protein.
• By understanding this process and how it is regulated we can make predictions and models of cells.

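The string view of DNA, RNA and protein used in these slides translates directly into code. The sketch below is an illustration rather than anything from the original deck: it reproduces the reverse-complement example for s = agacgt and then walks a short, made-up coding sequence through transcription and translation. It assumes the standard genetic code, treats its input as the coding strand (so transcription is just t to u), and uses hypothetical function names.

```python
from itertools import product

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}

def reverse_complement(s):
    """Complement each base and reverse, giving the other strand 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(s))

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks STOP.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def transcribe(coding_strand):
    """Write the mRNA for a coding strand: same sequence with t replaced by u."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read codons from position 0 until a STOP codon or the end of the string."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(reverse_complement("agacgt"))  # acgtct
mrna = transcribe("atggcctaa")       # toy gene: Met, Ala, STOP
print(mrna)                          # AUGGCCUAA
print(translate(mrna))               # MA
```

Proteins are printed in the one-letter amino acid alphabet (M for Met, A for Ala), which is the twenty-letter alphabet the slide refers to.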
[Figure labels: Assembly, Protein Sequence Analysis, Sequence analysis, Gene Finding]

Transcription
• Genes are expressed as proteins: typically one gene to one protein
• Genes are subsequences on chromosomes, started by a promoter region and ended around a stop codon

Transcription
Steps:
• DNA is split open over the gene after the promoter is recognized (there may be other regulatory regions upstream)
• mRNA is copied from the gene
• Introns are spliced out of the mRNA, keeping only the exons
• Ribosome (rRNA and protein complex) works on the mRNA

Transcription
• The process of making RNA from DNA
• Catalyzed by a “transcriptase” enzyme (RNA polymerase)
• Needs a promoter region to begin transcription
• ~50 base pairs/second in bacteria, but multiple transcriptions can occur simultaneously
http://ghs.gresham.k12.or.us/science/ps/sci/ibbio/chem/nucleic/chpt15/transcription.gif

Definition of a Gene
• Regulatory regions: up to 50 kb upstream of +1 site
• Exons: protein coding and untranslated regions (UTR)
  1 to 178 exons per gene (mean 8.8)
  8 bp to 17 kb per exon (mean 145 bp)
• Introns: splice acceptor and donor sites, junk DNA
  average 1 kb to 50 kb per intron
• Gene size: largest 2.4 Mb (Dystrophin), mean 27 kb

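Splicing, as described in the transcription steps, amounts to cutting the introns out of the transcript and joining the exons that remain. The following is a minimal Python sketch under simplifying assumptions: exon positions are given up front as 0-based, half-open (start, end) intervals, and the toy transcript is invented for illustration. Real splice-site recognition is far more involved and is not modelled here.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals of a transcript, discarding everything between them.

    `exons` is a list of (start, end) pairs, 0-based and half-open,
    sorted by position and non-overlapping (assumed, not checked).
    """
    return "".join(pre_mrna[start:end] for start, end in exons)

# Toy transcript: exon 1 (6 nt) + intron (12 nt) + exon 2 (6 nt).
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAUUAA"
pre_mrna = exon1 + intron + exon2
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
print(mature)                      # AUGGCCGAUUAA
print(len(pre_mrna), len(mature))  # 24 12
```

With the figures on the Definition of a Gene slide (mean exon 145 bp, introns averaging kilobases), the same operation typically removes most of the transcript.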
Translation
• tRNAs attach to codons on the mRNA
• At its other end, each tRNA carries the appropriate amino acid
• Amino acids are zipped together
• No tRNA for the STOP codon
• Every step is facilitated by an appropriate enzyme
[Figure: Central Dogma of biology]

Translation, continued
• Catalyzed by the ribosome
• Using two different sites, the ribosome continually binds tRNA, joins the amino acids together and moves to the next location along the mRNA
• ~10 codons/second, but multiple translations can occur simultaneously
http://wong.scripps.edu/PIX/ribosome.jpg

Revisiting the Central Dogma
• In going from DNA to proteins, there is an intermediate step where mRNA is made from DNA, which then makes protein
• This is known as The Central Dogma
• Why the intermediate step? DNA is kept in the nucleus, while protein synthesis happens in the cytoplasm, with the help of ribosomes

The Central Dogma (cont’d)

Open Reading Frame
• Three reading frames in a strand
• The complementary strand may have another three frames

Types of chromosomes
• Prokaryotes (bacteria, blue algae): circular
• Eukaryotes (have a nuclear wall): diploid (human has 23 pairs)
• Homologous genes and alleles (e.g., human blood types A, B, and O)
• Haploid chromosomes in eukaryote sex cells

DNA Sequencing
• A DNA fragment is split at each position starting from one end
• Four tubes: one containing molecules ending with G, one with A, one with T and another one with C
• Electrophoresis separates the chunks of different sizes in each tube [page 22]
• Information is recombined to sequence the DNA chunk
• Can be done only for chunks of ~1K bp

DNA Sequencing
• Human DNA is ~3*(10^9) bp long
• Restriction enzymes cut at restriction sites (a product of genetic engineering) [page 18]
• After sequencing, information from the fragments needs to be recombined to get the broader picture

DNA Sequencing
• Depends on finding a restriction site/enzyme for fragmenting DNA into pieces of appropriate size
• The privately funded TIGR project (Celera now) used heat and vibration to create fragments
• Recombining information is no longer trivial because each fragment’s location is no longer known
• Needed: a fragment assembly algorithm

DNA Sequencing
• Needs multiple copies of the DNA
• Recombinant DNA: copying DNA biologically within host organisms
• Polymerase Chain Reaction: heat to separate the two strands of DNA, then let each strand attract nucleotides to form double-stranded DNA; repeat
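
The fragment assembly problem mentioned above can be illustrated with a very small greedy procedure: repeatedly merge the two fragments with the longest suffix-prefix overlap until one string remains. This is only a sketch of the idea, written in Python with hypothetical function names. It is not the algorithm used by any sequencing project, it assumes error-free fragments from a single strand, and greedy merging is not guaranteed to find the shortest reconstruction.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments):
    """Merge fragments by largest overlap first; expects a non-empty list."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags[0]

# Toy reads covering the made-up sequence agacgtattc, in scrambled order.
reads = ["gtattc", "agacg", "acgta"]
print(greedy_assemble(reads))  # agacgtattc
```

Each PCR cycle described on the last slide doubles the number of copies, so n cycles give about 2^n copies of the original fragment, which is what makes enough material available for this kind of overlapping coverage.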