Bioinformatics Practical for Biochemists
Andrei Lupas, Birte Höcker, Steffen Schmidt
WS 2013/2014
01. DNA & Genomics

Description
• Lectures about general topics in bioinformatics and its history
• Tutorials will provide you with a toolbox of bioinformatics programs to analyse data
• Hands-on sessions will give you the opportunity to use these tools

Course Outline
• Mon – DNA & Genomics
• Tue – Introduction to Proteins
• Wed – Annotation of Sequence Features
• Thu – Protein Classification
• Fri – Evolution & Design
Course material: eb.mpg.de/research/departments/protein-evolution/teaching

Course Outline
• 13:00-14:00 Presentation
• 14:15-17:30 Tutorial (2 x 30 min) & hands-on practical
• You will need to keep an electronic lab notebook
• Fri afternoon: test exercises

Software Requirements
• Browser (e.g. Firefox)
• "Advanced" word processor
• PyMOL (www.pymol.org – free for teaching)

DNA & Genomics
[Figure: 1953 model of DNA (F. Crick)]

What is the "genetic material"?
• 1865 Gregor Mendel
  • basic rules of heredity
• 1869 Friedrich Miescher
  • discovery of 'nuclein' (DNA); Hoppe-Seyler repeated all experiments
• 1881 Edward Zacharias
  • chromosomes are composed of nuclein
• 1889 Richard Altmann
  • renaming nuclein to nucleic acid
Source: wikipedia.org

DNA is the "transforming material"
• 1928 Frederick Griffith
  • "transforming principle" – Str. pneumoniae experiment
• 1944 Avery, MacLeod & McCarty
  • Griffith's "transforming principle" is DNA
Source: history.nih.gov / wikipedia.org

DNA is the genetic material
• 1950 Erwin Chargaff
  • A and T, and C and G, occur in equal amounts; the same in different tissues
• 1952 Hershey & Chase
  • DNA is the genetic material, shown with the 32P/35S phage/E. coli experiment
Source: bacteriophagetherapy.info / www.lifesciencesfoundation.org

Solving the DNA structure
• 1952/53 Linus Pauling
  • beat Cavendish Lab in discovery of α-helix
• Cavendish Lab (Cambridge)
  • Watson & Crick allowed to work full-time on DNA
• Pauling shared manuscript with Cavendish Lab before publication (via his son Peter Pauling)
http://osulibrary.oregonstate.edu/specialcollections/coll/pauling/dna/notes/1952a.22-ms-01.html

Solving the DNA structure
• 1951/1952 Franklin & Wilkins
  • 1951 lecture with Watson attending
    • A-DNA / B-DNA
    • periodicity, phosphates are outside
  • 1953 X-ray of B-DNA (Photo 51)
    - Wilkins showed image to Watson
    - Perutz showed a confidential committee report to Watson & Crick

Solving the DNA structure
[Figure: Nature, 1953]

DNA structure
[Figure only]

Getting the "code"
• 1953 George E. Palade
  • "RNA organelles" (ribosomes)
• 1957 Crick et al.
  • suggest non-overlapping triplets
  • only 20 out of 64 triplets code for an amino acid
  • "comma-free code"

Getting the "code"
[Figure: page excerpt from the paper by Crick, Brenner, Barnett & Watts-Tobin, including the statement "(d) The code is probably 'degenerate'; that is, in general, one particular amino-acid can be coded by one of several triplets of bases" and Fig. 1, which shows the difference between an overlapping code and a non-overlapping code for a triplet code]
• 1961 Nirenberg & Matthaei
  • polyU mRNA
  • polyF protein
  • complete genetic code
• 1961 Sydney Brenner
  • no overlapping codes
  • concept of mRNA
  • triplet code (Crick, Brenner, Barnett, Watts-Tobin)

Getting the "code" – incl. start & stop codons
• Alternative start codons
  • AUG (83%)
  • GUG (14%)
  • UUG (3%)
• Alternative stop codons
  • UAA (63%, 'ochre')
  • UGA (29%, 'opal'), or Sec (selenocysteine)
  • UAG (8%, 'amber')
Frequencies are for E. coli.

Gene Structure
• 1977 Sharp & Roberts
  • pre-mRNA is processed
• 1982 Cech
  • ribozymes (ribonucleic acid enzymes)
• 1980 Joan A. Steitz
  • role of snRNPs in splicing
Source: wikipedia.org / yale.edu

Gene Structure – Eukaryotes / Prokaryotes
[Figure: lac operon. 1: regulatory gene; 3: β-galactosidase; 4: β-gal permease; 8: β-gal transacetylase; promoter region]

Gene Structure – Polysomes in Prokaryotes
• EM picture of polysomes on a chromosome
[Figure labels: mRNA with ribosomes; DNA; transcription initiation]
Miller, O. L. et al. Visualization of bacterial genes in action. Science 169, 392–395

Gene Structure – Prokaryotes
[Figure; source: u-tokyo.ac.jp]

Gene Structure – Prokaryotic Operons
[Figure: lac operon, labelled as above]
Griswold, A. (2008) Nature Education 1(1)
Understanding Bioinformatics, Zvelebil & Baum, 2007

Gene Structure – Eukaryotes
[Figure; source: zazzle.com]

Gene Structure – Gene density in Eukaryotes
[Figure: genome browser view (hg19), 10 Mb scale bar; tracks: RefSeq Genes, 100 vertebrates Basewise Conservation by PhyloP, Repeating Elements by RepeatMasker]

Gene Structure – Comparison
Genes
• Eukaryote
  • often have introns
  • intraspecific gene order and number generally relatively stable
  • many non-coding (RNA) genes
  • there is NOT generally a relationship between organism complexity and gene number
• Prokaryote
  • no introns
  • gene order and number may vary between strains of a species
Gene regulation
• Eukaryote
  • promoters, often with distal long-range enhancers/silencers, MARs, transcriptional domains
  • generally monocistronic
• Prokaryote
  • promoters
  • enhancers/silencers rare
  • genes often regulated as polycistronic operons
Repetitive sequences
• Eukaryote
  • generally highly repetitive, with genome-wide families from transposable element propagation
• Prokaryote
  • generally few repeated sequences
  • relatively few transposons
Organelle (subgenomes)
• Eukaryote
  • mitochondrial (all)
  • chloroplasts (in plants)
• Prokaryote
  • absent

Genomic era
• 1977 Frederick Sanger
  • dideoxy sequencing
• 1986 Human Genome Initiative
• Genomes
  • 1995 H. influenzae, 1.8 Mb, 1.7k genes
  • 1997 E. coli, 4.6 Mb, 4.3k genes
  • 1996 S. cerevisiae, 12.5 Mb, 5.7k genes
  • 1998 C. elegans, 100 Mb, 21.7k genes
  • 2000 D. melanogaster, 121 Mb, 17k genes

Prokaryotic Genome
• E. coli
  • 4.6 Mbp
  • 1 by 2 µm cell size
[Figure: supercoiled chromosome of E. coli. Kavanoff, Nature Education]

The human genome
• 2001 Draft H. sapiens, 2.9 Gb, 20-30k genes
Science (2001), Nature (2001)

Gene content
[Figure only]

Genome Structure – Comparison
Size
• Eukaryote
  • large (10 Mb – 100,000 Mb)
  • there is not generally a relationship between organism complexity and genome size (many plants have larger genomes than humans!)
• Prokaryote
  • generally small (<10 Mb; most <5 Mb)
  • complexity (as measured by number of genes and metabolism) generally proportional to genome size
Content
• Eukaryote
  • most DNA is non-coding
• Prokaryote
  • DNA is "coding gene dense"
Telomeres / centromeres
• Eukaryote
  • present (linear DNA)
• Prokaryote
  • circular DNA, doesn't need telomeres
  • no mitosis, hence no centromeres
Number of chromosomes
• Eukaryote
  • more than one, (often) including those discriminating sexual identity
• Prokaryote
  • often one, sometimes more; plasmids are not true chromosomes
Chromatin
• Eukaryote
  • histone bound (which serves as a genome regulation point)
• Prokaryote
  • no histones
  • uses supercoiling to pack genome

Human Genome Content
• LINEs 20.4%
• SINEs 13.1%
• LTR retrotransposons 8.3%
• DNA transposons 2.9%
• Simple sequence repeats 3%
• Segmental duplications 5%
• Miscellaneous heterochromatin 8%
• Miscellaneous unique sequences 11.6%
• Protein-coding genes 1.5%
• Introns 25.9%
Gregory (2005), Nature

Gene Structure – Eukaryotic Gene
[Figure: UCSC genome browser view of SMG5 (hg19, chr1:156,225,000-156,250,000, 10 kb scale bar); tracks: UCSC Genes (RefSeq, UniProt, CCDS, Rfam, tRNAs & Comparative Genomics), Placental Mammal Basewise Conservation by PhyloP, Common SNPs (dbSNP 135, found in >= 1% of samples), Repeating Elements by RepeatMasker]

Transposable Element - Mobile Elements / Jumping genes
• Barbara McClintock (1902-1992)
  • studies in the 40's & 50's of spotted kernels in maize
  • discovery of "controlling elements"
  • Nobel prize in 1983
• initially thought to be unique to maize but later also found in other eukaryotes, bacteria, viruses, phages & plasmids
Source: wikipedia.org

Transposable Element - Mobile Elements / Jumping genes
• DNA transposons
  • transposase cuts out transposon & inserts it at the target site
  • "cut-and-paste" mechanism
  • prokaryotes & eukaryotes
• Retrotransposons
  • transposon DNA transcribed to RNA
  • insertion into genome by reverse transcription
  • LTR, LINEs, SINEs
  • eukaryotes only
Source: wikipedia.org
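
The two transposition mechanisms listed on the last slide can be sketched as a toy string model. This is only an illustration under strong assumptions: the genome is a plain string, the transposon is a known substring, and target-site duplications, strand orientation and the enzymes themselves are ignored. All function names are hypothetical and not taken from the course material; "copy-and-paste" is the common shorthand for the retrotransposon route, not a term used on the slide.

```python
# Toy model of the two transposition mechanisms.
# Assumption: a genome is a plain string and the transposon is genome[start:end].
# All names below are hypothetical and exist only for this illustration.


def cut_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """DNA transposon: excise genome[start:end] and reinsert it at `target`.

    `target` is a position in the genome that remains after excision.
    """
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


def transcribe(dna: str) -> str:
    """Write the RNA copy of a coding-strand DNA string (T becomes U)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """Write the DNA copy of an RNA string (U becomes T)."""
    return rna.replace("U", "T")


def copy_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Retrotransposon: the original element stays in place; a new copy,
    made through an RNA intermediate, is inserted at `target`.

    `target` is a position in the original genome.
    """
    rna = transcribe(genome[start:end])
    new_copy = reverse_transcribe(rna)
    return genome[:target] + new_copy + genome[target:]


genome = "AAAA" + "GATTACA" + "CCCC"  # the element occupies positions 4..10

moved = cut_and_paste(genome, 4, 11, 8)
print(moved)                     # AAAACCCCGATTACA (same length, element moved)

copied = copy_and_paste(genome, 4, 11, 15)
print(copied)                    # AAAAGATTACACCCCGATTACA (element duplicated)
print(copied.count("GATTACA"))   # 2
```

The model makes one point from the slide concrete: cut-and-paste leaves the genome length unchanged, while each retrotransposition event adds a copy. That is one reason retrotransposon families such as LINEs and SINEs can come to occupy a large share of a eukaryotic genome.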