GENOMES & GENOME EVOLUTION

Genomes and Genome Evolution - BIOL 4301/6301
What to expect and some suggestions.
– I like to think of myself as fair but reasonably tough
  • I want people to do well, but I’m not willing to compromise on the material or ethical guidelines to make it happen.
– There is no extra credit. This is non-negotiable.
  • Study for the exams and do well on them.
– Ask questions IN CLASS
  • Makes things more interesting for me
  • Others probably have the same question
  • You’re paying, get your money’s worth
  • Interactions with other humans tend to wake people up
  • Office hours!!!!!!!!!! I have them. Take advantage.
– I am an evolutionary biologist. This class is taught from an evolutionary perspective.

Genomes and Genome Evolution - BIOL 4301/6301
What to expect and some suggestions.
– Absorb and critique anything related to the subject. This includes but is not restricted to:
  • Popular news articles, TV shows (CSI, Bones, etc.), textbooks, Wikipedia, etc.
  • Genomics is everywhere. Bring in what you find for discussion.
– Website - http://www.myweb.ttu.edu/daray/Teaching.htm
  • Username & password
– Again, ask questions during class
– Ask questions DURING CLASS
– Did I mention that you should ask questions during class?
– You WILL see pictures of my adorable children. This is also non-negotiable.
Course Objectives and Assumptions
• Objectives: By the end of this course you should be able to…
  • describe the methods and principles of modern genome analysis
  • describe the components and structure of viral, prokaryotic and eukaryotic genomes
  • explain the basic techniques of genome sequencing and analysis
  • describe the way genomes change over time
  • apply principles of genomics to modern biological questions
  • explain the outcomes of a variety of genome projects

Course Objectives and Assumptions
• Assumptions: I am assuming that you…
  • have a working knowledge of Mendelian genetics
  • have a working knowledge of DNA, RNA and proteins
  • understand the basic differences between eukaryotes and prokaryotes
  • have a basic understanding of the concept of a gene
  • have a working knowledge of the ‘central dogma’ of Biology
  • give a rat’s behind about learning this stuff
  • have considered enrolling in Bioinformatics. While not required, it would be a good idea to take Caleb Phillips’ course concurrently

“No course should ever be taught the first time”

UNIT 1
FUNDAMENTAL CONCEPTS

The biggest failure of science education is…
• Most people can’t discriminate between what is scientific and what is not scientific.
• This is due, in part, to the fact that definitions of science tend to be fairly nebulous.
• Moreover, any moron can get a Ph.D.

Science
• A method for discovering how the world around us works
• Assumes that all things can be explained by natural processes
• Does not allow supernatural explanations
  • Why?
• Rooted in hypothesis formation, observation, testing, and constant re-examination of evidence
• Hypotheses MUST be abandoned if they are not supported by evidence
• The scientific community is intensely critical of its own ideas and the ideas of others. The advantage of this isn’t that mistakes aren’t made, it’s that this method pretty much guarantees that mistakes are caught quickly.

Science
• Step 1.
Propose as many ideas as you can think of to explain a phenomenon, then pick one or several.
• Step 2. Try to disprove it/them.
• Step 3. Allow others to try to disprove it/them.
• Basic philosophy - Ideas that survive this process are more likely to reflect the real world than ideas that don’t.

Other belief systems
Religion
• A way of “knowing” that is not rooted in scientific principles, but rather is based upon alternate philosophies, mythologies, etc. Most religions have some supernatural aspects. Many religions are opposed to critical inquiry of the beliefs professed.
Pseudoscience
• Any non-scientific belief system that uses scientific jargon in an attempt to give it scientific credence. Again, criticism of the concepts is often discouraged.

Ways of thinking
• Of these ways of thinking, science is the “new kid on the block”
• Science is a relatively new invention (arguably only a few hundred years old, if that)
• But think of all the progress that’s been made in those few hundred years because of scientific thinking

UNIT 1
FUNDAMENTAL BIOLOGICAL CONCEPTS

Genome
• Definition depends upon the organism, organelle, or virus one is talking about
• Generic definition: Minimum DNA complement that defines an organism/organelle/virus
• Organelles are not, in and of themselves, living creatures. Thus something can have a genome and not be “alive.”
• Viruses may or may not be alive, depending upon how one defines life
• The dead have genomes too.

Things with genomes
• Prokaryotes
  • Monera (bacteria)
  • Archaea
• Mitochondria
• Chloroplasts
• Viruses
• Eukaryotes
  • Animals
  • Plants
  • Fungi
  • Protists

Things without genomes
• Dirt
• Rocks
• Water
• Air
• Fire
• But even these things may be contaminated with genomic DNA (…well, maybe not fire)

What genomes can and can’t do
• A genome constrains but does not dictate the features of an organism
• Environmental impacts
  • Toxins, exercise, exposure to disease
• Epigenetic impacts
• If someone were to clone Hitler…?
Genomics
• Study of genomes?
• Research in which robotics, automated sequencing, and advanced computational methods are utilized to rapidly and efficiently characterize genomes and their components

The Central Dogma
• DNA → RNA → Protein
• Generally unidirectional

Nucleic Acids
• Ribonucleic acid (RNA) and deoxyribonucleic acid (DNA)
• Composed of chains of nucleotides (ribonucleotides for RNA, deoxyribonucleotides for DNA)

Nucleic Acids
• Deoxyribonucleic acid
  • A polymer of nucleotides linked by phosphodiester bonds

Nucleic Acids
– Purine vs. pyrimidine
– Carbon positions

Nucleic Acids
• Deoxyribonucleic acid
  • Antiparallel strands held together by hydrogen bonds
  • Strands are complementary

DNA in 3D
Scanning-tunneling electron micrograph
Pretty uncanny resemblance, don’t you think?

Nucleic Acids
• Deoxyribonucleic acid can denature, renature & hybridize
  • Denaturation – separation of the double helix by the addition of heat or chemicals
  • Renaturation – the reformation of double stranded DNA from denatured DNA
  • The rate at which a particular sequence will reassociate is proportional to the number of times it is found in the genome
  • Given enough time, nearly all of the DNA in a heat denatured DNA sample will renature.

Nucleic Acids
• Ribonucleic acid
• Ribose vs.
deoxyribose
• Thymine = 5-methyluracil
• Usually single stranded

Nucleic Acids
• Intramolecular base pairing
• Enhanced base-pairing capacity due to G:U bonding
• Hairpins
• Bulges
• Loops
• Stem-loop structures
• Pseudoknots

Nucleic Acids
• Complex tertiary structures
• Much more flexible than DNA
• Capable of base triples and base-backbone interactions
• Often ‘molded’ by proteins and snoRNPs
• Leads to complex 3° structures with catalytic capability - ribozymes

Nucleic Acids
[Figure: backbone structures of DNA and RNA]

RNA World
• RNAs can have complex 3D structures
• They can store genetic information
• Some RNAs known as ribozymes can catalyze reactions
• Thus it has been hypothesized that life may have arisen first through RNA with protein and DNA being integrated later

Replication
• DNA is replicated in a semi-conservative fashion, i.e., each daughter molecule is composed of one strand of the original molecule and one newly synthesized strand.
• DNA polymerase is the enzyme that catalyzes synthesis of new strands out of dNTPs.

Replication: Key points
• DNA polymerase cannot generate a new strand without a 3’ OH on which to add a nucleotide.
• Primers are required.
• New strands generated from 5’ to 3’.
• Replication is bidirectional. Replication forks proceed from an initiation site in both directions.
• Multiple sites of initiation are found along a chromosome.
• Initiation sites are often AT rich as AT base pairs are less stable and thus come apart more easily.
• Okazaki fragments are generated along lagging strand.
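A minimal Python sketch of the replication points above may help: strands are antiparallel and complementary, and each daughter duplex keeps one parental strand. It is a toy model, not a simulation of polymerase activity (no primers, forks, or Okazaki fragments). The function names and the example sequence are illustrative assumptions, not part of the course material.

```python
# Toy model: every strand is a plain string written 5'->3'.
# All names here are illustrative; nothing comes from a real library.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_partner(template: str) -> str:
    """Return the strand that base-pairs with `template`, written 5'->3'."""
    # Strands are antiparallel, so the new strand read 5'->3' pairs with
    # the template read from its 3' end: this is the reverse complement.
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Semi-conservative replication: each daughter keeps one old strand."""
    top, bottom = duplex
    daughter_1 = (top, synthesize_partner(top))        # old top + new bottom
    daughter_2 = (synthesize_partner(bottom), bottom)  # new top + old bottom
    return daughter_1, daughter_2


def at_fraction(seq: str) -> float:
    """Fraction of A/T bases; AT-rich stretches separate more easily."""
    return sum(base in "AT" for base in seq) / len(seq)


parent = ("ATGCGTTA", synthesize_partner("ATGCGTTA"))
daughter_1, daughter_2 = replicate(parent)
print(parent)
print(daughter_1, daughter_2)
print(at_fraction(parent[0]))
```

Both daughters come out identical to the parent duplex, which is the point of complementary base pairing: either strand alone carries enough information to rebuild the other.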
http://www.johnkyrk.com/DNAreplication.html
http://www.dnalc.org/resources/3d/04-mechanismof-replication-advanced.html

RNA
• Normally single-stranded
• Generated from NTPs by RNA polymerase using DNA as a template (transcription)
• As with DNA replication, new strand assembled in 5’ to 3’ direction by phosphodiester bond formation
• RNA is inherently less stable than DNA

Major types of RNA
• Messenger RNA (mRNA) – carries genetic instructions (coded in DNA) from the nucleus into the cytoplasm. mRNA molecules are often called transcripts.
• Ribosomal RNA (rRNA) – a structural component of ribosomes (the complexes that are involved in assembling proteins based upon information in mRNA templates)
• Transfer RNA (tRNA) – acts as carrier of amino acids during protein assembly
• Regulatory RNAs – Many groups; miRNAs, siRNAs, CRISPR RNAs, antisense RNAs, long non-coding RNAs

Transcription
• Generation of an RNA strand from a DNA template
• Much of the control over cell development comes at the transcriptional level
  – All somatic cells have the same DNA but can differ tremendously in morphology and function
• Differential gene expression

Transcription: Key points
• Transcription starts at the promoter, a site along the DNA molecule where RNA polymerase binds.
• RNA polymerase is recruited to the promoter by transcription factors.
• New strand generated from 5’ to 3’.
• Only one of the two DNA strands serves as a template (antisense strand). The other strand (sense strand) has the same sequence as the mRNA molecule, except that UMPs in the mRNA take the place of dTMPs.
• Which strand is used as a template differs between genes.
• After transcription, mRNA undergoes post-transcriptional modifications. Generally, a methyl-guanosine cap is added to the 5’ end and a tail of adenosine nucleotides (poly-A tail) is added to the 3’ end.
• In eukaryotes, the mRNA undergoes post-transcriptional splicing – introns are removed and exons are spliced together.
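The transcription key points above can be sketched in a few lines of Python: the antisense strand is the template, the transcript matches the sense strand with U in place of T, introns are removed, and a poly-A tail is added. This is a simplified illustration under stated assumptions: the sequence, the exon coordinates, and every function name are made up for the example, and the 5’ cap is left out.

```python
# Hypothetical toy gene: exon 1, an intron that starts with GT and ends with AG, exon 2.
EXON_1, INTRON, EXON_2 = "ATGGCC", "GTAAGTTTCAG", "GAATAA"
sense = EXON_1 + INTRON + EXON_2  # sense strand, written 5'->3'

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# The template (antisense) strand, written 3'->5' so it lines up with `sense`.
template_3to5 = "".join(DNA_COMPLEMENT[base] for base in sense)


def transcribe(template: str) -> str:
    """Pair each template base (read 3'->5') with a ribonucleotide,
    which builds the RNA in the 5'->3' direction."""
    return "".join(DNA_TO_RNA[base] for base in template)


def splice(pre_mrna: str, exons) -> str:
    """Keep only the exon intervals (half-open index pairs), in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def add_poly_a(mrna: str, tail_length: int = 5) -> str:
    """Append a short poly-A tail (real tails are much longer)."""
    return mrna + "A" * tail_length


pre_mrna = transcribe(template_3to5)
exon_coords = [(0, len(EXON_1)), (len(EXON_1) + len(INTRON), len(sense))]
mature = add_poly_a(splice(pre_mrna, exon_coords))
print(pre_mrna)
print(mature)
```

Note that `pre_mrna` is simply the sense strand with every T replaced by U, which is why the sense strand is the one usually written down when a gene sequence is reported.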
Transcription models
• http://www.johnkyrk.com/DNAtranscription.html
• http://www.dnalc.org/resources/3d/13-transcription-advanced.html

A few definitions
• Precursor mRNA (pre-mRNA) or heterogeneous nuclear RNA (hnRNA): mRNA immediately after transcription and before post-transcriptional modification
• Mature mRNA (or simply mRNA): Transcript after post-transcriptional modifications.
• cDNA (complementary DNA): A DNA molecule generated in a reaction catalyzed by reverse transcriptase using mature mRNA as the template.

rRNA
• Associated with proteins to form ribosomes
• Several different rRNAs
• Genes that code for rRNA are typically referred to as rDNA sequences
• rDNA sequences found in more or less tandem repeats in genome

tRNA
• tRNA molecules deliver amino acids to ribosomes during protein synthesis (translation)
• tRNAs have considerable secondary structure due to base pairing
• Clover leaf 2D structure
• L-shaped 3D structure
• There are more than 20 tRNAs (i.e., there is some redundancy)
• tRNA structure is highly conserved (e.g., human tRNAs can function in yeast)
• http://www.myweb.ttu.edu/daray/Genomes/ribosome/ribosome/ribosome_jmol_play.html

Amino acids
• Proteins are made of chains of amino acids
• There are 20 amino acids utilized by biological systems
• Each codon in mRNA represents an amino acid or a start/stop signal
• Amino acids can be acidic (net negative charge), basic (positive charge), uncharged polar (ends have different net charges), and non-polar.
• Uncharged polar, acidic, and basic amino acids tend to be hydrophilic and thus are often found on the outside of proteins. Non-polar amino acids tend to be hydrophobic and thus are clustered in the middle of proteins.

Genetic code

Formation of a peptide bond
• At physiological pH (7.0), both the amino and carboxyl groups are ionized.
• The peptidyl transferase ribozyme catalyzes the formation of peptide bonds with the concomitant release of a water molecule.
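To make "each codon in mRNA represents an amino acid or a start/stop signal" concrete, here is a small Python sketch that builds the standard genetic code and reads an mRNA three bases at a time. The table is the standard code in one-letter amino acid symbols (`*` marks a stop). The `translate` helper and the example transcript are illustrative assumptions, and the sketch ignores ribosomes, tRNAs, and reading-frame subtleties.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


def codons_for(amino_acid: str):
    """All codons that encode one amino acid (shows the code's redundancy)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(translate("AUGGCCGAAUAAAAAAA"))  # hypothetical short transcript
print(codons_for("L"))                 # leucine is encoded by several codons
```

With 64 codons and only 20 amino acids plus stop signals, most amino acids are encoded by more than one codon, which is the redundancy mentioned on the tRNA slide.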
Translation
• Construction of an amino acid chain (protein) by a ribosome based upon the nucleotide sequence of an mRNA molecule
• While there are minor differences between eukaryotic and prokaryotic translation processes, most steps in translation are well conserved.
http://www.johnkyrk.com/DNAtranslation.html

Spatial separation of transcription and translation is seen in eukaryotes, not prokaryotes

What is a gene?
• How do we identify a gene?
• A priori methods – recognize sequence patterns within expressed genes and the regions flanking them
  • Distinctive patterns of codon statistics (most obviously, a reduced frequency of stop codons)
  • Proximity of start codon and known promoter sites
  • GT/AG pairs at intron boundaries
  • Codon usage statistics can be ‘typical’ of genes in an organism
    • Use a set of known genes to identify regions with similar codon usage stats
• ‘Been there, seen that’ methods – recognize regions corresponding to previously characterized genes.
• The changing definition of a ‘gene’

The structure of a typical coding gene

Genes vs. alleles vs. loci
• Gene: “Region of DNA that controls a discrete hereditary characteristic, often (but not always) corresponding to a single protein or RNA. This definition includes the entire functional unit, encompassing coding DNA sequences, non-coding regulatory DNA sequences, and introns.”
• Allele: “One of a set of alternative forms of a gene.”
• Locus: “The position of a gene on a chromosome. Different alleles of the same gene all occupy the same locus.”
• Definitions from Alberts et al.
(1994)

Recombination
• Protein-mediated (1) exchange of a DNA region between two different DNA molecules OR (2) replacement of a DNA region in one molecule by DNA from another
• Almost always requires at least some homology between sequences involved

Recombination
• Non-homologous recombination
• Duplication/deletion

Recombination
• Gene Conversion
  • Non-crossover recombination – replacement of one allele with an alternative
• Function and impacts
  • Regulation of gene expression
  • Homogenization of genome sequence
  • 21-hydroxylase – 95% of pathogenic mutations arise by gene conversion of neighboring pseudogene

Expression patterns
• There are ~23,000 protein coding human genes, which can give rise to a minimal protein set
• No single cell needs to express all of those proteins
  • Ex. - Lac operon in bacteria, insulin in humans
• Or may need alternate versions of them
  • Alternative splicing
• The amount of a protein must also be regulated
  • Overexpression of a single gene rarely causes disease but,
  • Lack of expression of a single gene can cause major problems

Expression patterns
Xeroderma pigmentosum (XP)
• 7 distinct types, all caused by a deficient NER (nucleotide excision repair) system
• Extreme sensitivity to sunlight, high incidence of skin cancer
• Creams containing DNA repair enzymes help

Transcriptional regulation
• Most regulation takes place at the transcription level
• Simple in prokaryotes - Repressors, activators, the lac operon

Transcriptional regulation
• The lac operon
• Leaky control of lacZ
• Allolactose version of lactose actually metabolized
• Allolactose acts as a ligand that turns on transcription (deactivates repressor)
• Lactose converted to allolactose using β-galactosidase
• How, if lacZ turned off?
• Leaky genes
• Every once in a while, RNA pol slips into place on the promoter in place of repressor
• Constitutive low level expression

Protein activity regulation
• Protein turnover
• Chemical modification
• Inhibition
• Allostery

Transcriptional regulation
• The lac operon regulation
• Lactose+, glucose- environment
• Allolactose acts as the ligand that turns on transcription
  • Allolactose binds to lac repressor to allosterically disable binding to operator
• cAMP levels in cell inversely related to glucose levels
  • Low glucose = high cAMP
  • cAMP allosterically activates CAP
+lactose → +allolactose → Allosteric binding to lac repressor → +lacZ expression → +lactose metabolism
-glucose → +cAMP → Allosteric binding to CAP → +lacZ expression → +lactose metabolism

Transcriptional regulation
• The lac operon regulation
• Lactose-, glucose+ environment
-lactose → -allolactose → No allosteric binding to lac repressor → lacZ repression → -lactose metabolism
+glucose → -cAMP → No allosteric binding to CAP → -lacZ expression → -lactose metabolism

Transcriptional regulation
• The lac operon regulation
• Lactose+, glucose+ environment
• Repressor is released, but expression is not increased by CAP
+lactose → +allolactose → Allosteric binding to lac repressor → +lacZ expression → +lactose metabolism
+glucose → -cAMP → No allosteric binding to CAP → No activation of CAP → No increased lacZ expression (basal metabolism)

Transcriptional regulation
• Most regulation takes place at the transcription level
• Much more complex in eukaryotes

Transcriptional regulation
• Most regulation takes place at the transcription level
• Increased complexity in eukaryotes - β-globin regulation

Transcriptional regulation
• Gene silencing
• Imprinting – selective expression of one parental allele
• Neighboring genes, Igf2 and H19, are on and off depending on parental source
  • Igf2 – Insulin-like growth factor 2
    • Highly active during fetal development
  • H19 – a non-coding RNA
    • May act as a tumor suppressor
• What is involved in this regulation?
  • Downstream enhancer
  • CTCF – regulatory protein
  • ICR – imprinting control region

Transcriptional regulation
• Gene silencing
• Activators bound to enhancer could potentially activate both genes
• Maternal chromosome is unmethylated in this region
  • Lack of methylation allows binding of CTCF to ICR
  • CTCF blocks activation of Igf2
  • … allows activation of H19
• Paternal chromosome is methylated in this region
  • Methylation blocks binding of CTCF to ICR
  • … blocks activation of H19 via MeCP2

Transcriptional regulation
• Gene silencing
• Beckwith-Wiedemann syndrome (BWS)
• ~1/15,000 births
• Increased risk of cancer (Wilms’ tumor)
• Hemihypertrophy
• Improper imprinting
  • Biallelic expression of Igf2
  • No expression of H19

Translational regulation
• RNA interference
• Ligand binding to Shine-Dalgarno
• RNA lifespan
• Alternative splicing
• tRNA availability/codon usage

Supplemental review
• Review material to brush up on these subjects is available on the course website
  • Structural tutorials
  • Walkthroughs of DNA synthesis, DNA replication, Transcription, Translation, Recombination, etc.
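As a review aid for the lac operon slides earlier in this unit, the three environments can be written as a small decision function in Python. This is a deliberately coarse sketch: expression is reduced to three labels, the function and label names are assumptions made for illustration, and the lactose-, glucose- case is not covered on the slides, so its outcome here (repressed) is an inference from the repressor logic.

```python
def lac_operon_state(lactose: bool, glucose: bool) -> str:
    """Classify lacZ expression from two environmental inputs.

    Labels are illustrative:
      'repressed' - repressor on the operator (only leaky, low-level expression)
      'basal'     - repressor released, CAP not activated
      'high'      - repressor released and CAP activated by cAMP
    """
    # Lactose is converted to allolactose, which binds the lac repressor
    # and allosterically disables its binding to the operator.
    repressor_on_operator = not lactose
    # cAMP is inversely related to glucose; cAMP activates CAP.
    cap_active = not glucose

    if repressor_on_operator:
        return "repressed"
    return "high" if cap_active else "basal"


for lactose in (True, False):
    for glucose in (True, False):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} ->",
              lac_operon_state(lactose, glucose))
```

The function checks the repressor first because CAP activation cannot raise expression while the repressor still occupies the operator; that ordering is the main design choice in the sketch.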