Regulatory Genomics
Lecture 1
November 2012
Yitzhak (Tzachi) Pilpel

Course requirements
• Attendance and participation
• Five reading assignments
• A final take-home exam based on reading papers
• Website
In total 13 or 14 meetings (not 17…)
No meeting on Nov 15th

Genomics marked the beginning of a new age in biology and medicine
1900: Rediscovery of Mendel's laws helps establish the science of genetics
1953: Watson and Crick identify DNA (the double helix) as the chemical basis of heredity
1977: Sanger and Gilbert derive methods of sequencing DNA
1980: DNA markers used to map human disease genes to chromosomal regions
1983: Huntington disease gene mapped to chromosome 4
1990: Human Genome Project (HGP) begins, an international effort to map and sequence all the genes in the human genome
1994-98: Genetic and physical mapping
1998: DNA markers used to map human disease genes to chromosomal regions
2000: Working draft of the human genome sequencing complete
2005: Release of Human Genome Project
Source: Health Policy Research Bulletin, volume 1, issue 2, September 2001

The genome browser
Link

Number of protein coding genes
[Figure labels, in extraction order: 20,210; 13,601; Mustard (Arabidopsis); Fruit fly; Mouse; 19,735; Worm (C. elegans); 20,568; 5,616; Yeast (S. cerevisiae); 482; Mycoplasma genitalium]

How come we have so few genes given that we are so complex???
• We have many non-protein-coding genes
• Our genes are longer and more complex
• Regulation of human gene activity is more complex
• Repeats, formerly known as “junk DNA” (yet not garbage), contribute to complexity
• Combinatorial interactions among genes and products
[Figure labels: 21,710; 19,735]

The hierarchical structure of the genome

Expressing the genome
Lodish et al. Molecular Cell Biology (5th ed.). W.H. Freeman & Co., 2003.

The Central Dogma: a cellular context

The Central Dogma of Molecular Biology
Expressing the genome
[Diagram labels: inactive DNA, DNA, RNA, mRNA, protein]

Evolution

Corrected view of evolution

The tree of life

How do genomes evolve?
Consider two distinct possibilities:
• Genomes evolve by lots of de novo “inventions”
• Genomes evolve predominantly by mixing and matching existing parts

Classification of protein structures

Very slow growth in the number of protein folds
Very few structural “inventions”

Comparing a certain family (e.g. kinases) in different species reveals few “inventions”

Analogy:
• Technology
• Language

Some basic evolutionary operations
• Mutating existing DNA
• Change gene expression profiles
• Duplications of existing material (genes, chromosomes, genomes)
• Transfer of genes from one organism to another
• Functionalization of “junk DNA”
• Reverse transcription??

Stress conditions induce a high DNA replication error rate
Because most newly arising mutations are neutral or deleterious, it has been argued that the mutation rate has evolved to be as low as possible, limited only by the cost of error-avoidance and error-correction mechanisms. But up to one per cent of natural bacterial isolates are 'mutator' clones that have high mutation rates. We consider here whether high mutation rates might play an important role in adaptive evolution. Models of large, asexual, clonal populations adapting to a new environment show that strong mutator genes (such as those that increase mutation rates by 1,000-fold) can accelerate adaptation, even if the mutator gene remains at a very low frequency (for example, 10^-5). …

Some basic evolutionary operations
• Mutating existing DNA
• Change gene expression profiles
• Duplications of existing material (genes, chromosomes, genomes)
• Transfer of genes from one organism to another
• Functionalization of “junk DNA”
• Reverse transcription??

A slight change in the expression program can make a big difference: an olfactory receptor can “smell the egg”

Science. 2003 Mar 28;299(5615):2054-8.
Identification of a testicular odorant receptor mediating human sperm chemotaxis.
Spehr M, Gisselmann G, Poplawski A, Riffell JA, Wetzel CH, Zimmer RK, Hatt H.
Source: Department of Cell Physiology, Ruhr University Bochum, 150 University Street, D-44780 Bochum, Germany.
Abstract: Although it has been known for some time that olfactory receptors (ORs) reside in spermatozoa, the function of these ORs is unknown. Here, we identified, cloned, and functionally expressed a previously undescribed human testicular OR, hOR17-4. With the use of ratiofluorometric imaging, Ca2+ signals were induced by a small subset of applied chemical stimuli, establishing the molecular receptive fields for the recombinantly expressed receptor in human embryonic kidney (HEK) 293 cells and the native receptor in human spermatozoa. Bourgeonal was a powerful agonist for both recombinant and native receptor types, as well as a strong chemoattractant in subsequent behavioral bioassays. In contrast, undecanal was a potent OR antagonist to bourgeonal and related compounds. Taken together, these results indicate that hOR17-4 functions in human sperm chemotaxis and may be a critical component of the fertilization process.

Some basic evolutionary operations
• Mutating existing DNA
• Change gene expression profiles
• Duplications of existing material (genes, chromosomes, genomes)
• Transfer of genes from one organism to another
• Functionalization of “junk DNA”
• Reverse transcription??
Gene duplication might provide redundancy
[Diagram labels: duplication, nonfunctionalization, neofunctionalization, subfunctionalization]

Chromosome III duplicates in heat
[Figure: log2(expression evo39 / evo30) versus gene index, all genes compared with chromosome III genes; P value < 10e-100]

Heat shock tolerance correlates with chromosome III copy number
[Figure: relative survival for: Evolved, 3 copies; WT, 3 copies; WT, two copies; WT, one copy]

Conclusions from the experiment
• Chromosomes are easily gained and lost in yeast evolution
• A more fine-tuned solution may follow chromosome duplication
• A striking similarity between repeated experiments
• A chromosome-condition specificity?

Many gene duplicate distances correspond to 60-70 mya!!
[Figure axis: sequence similarity between gene pairs]

Some basic evolutionary operations
• Mutating existing DNA
• Change gene expression profiles
• Duplications of existing material (genes, chromosomes, genomes)
• Transfer of genes from one organism to another
• Functionalization of “junk DNA”
• Reverse transcription??

Horizontal (“lateral”) gene transfer: transfer of genes between organisms, mostly under stress

Some basic evolutionary operations
• Mutating existing DNA
• Change gene expression profiles
• Duplications of existing material (genes, chromosomes, genomes)
• Transfer of genes from one organism to another
• Functionalization of “junk DNA”
• Reverse transcription??

Evolution of transcriptional switches
[Diagram labels: TF1, TF2]
Similar function: CACGCGTT -> CACGCGTA (neutral selection)
Disrupted function: CACGCGTT -> CACGAGTT (low rate, purifying selection)
Altered function: CACGCGTT -> CACACGTT (low rate, purifying selection)
Gained function: CACGCGTT -> CACACGTT (low rate, purifying selection)

Evolution of transcription networks

Repetitive elements in the human genome
• Alu elements are repetitive retrotransposon elements in the human genome.
• Alu elements are about 300 base pairs long and are therefore classified as short interspersed elements (SINEs)
• There are over one million Alu elements interspersed throughout the human genome
• About 10% of the human genome consists of Alu sequences.

Retro-transposition

Alus may contain binding sites for TFs and microRNAs
[Diagram labels: Alus]

Can the phenotype shape the genotype?
Classical Darwinian theory [diagram: Genotype, Phenotype]
Lamarckian theory [diagram: Genotype, Phenotype]

The Central Dogma: a cellular context

[Diagram labels: cell membrane, nucleus; cell division, protein synthesis, attack a virus, differentiate, cell death]

From parts to networks…

Reporter genes reveal spatio-temporal expression programs

In unicellular organisms, responses to environmental signals affect gene expression dramatically
[Figure axis label: Genes]
Gasch et al., Mol Biol Cell. 2000 Dec;11(12):4241-57.

The transcriptome during the cell cycle
Spellman et al., Mol Biol Cell. 1998 Dec;9(12):3273-97

[Diagram labels: coding DNA strand, non-coding strand, RNA]

Transcription regulation
• The hardware
• The software
• The input
• The output

The initiation machinery complex

Transcription factors bind the DNA

Keys (regulators) can scan the genome in search of their locks (recognition sites)
[Example site: ATACGAT]

Transcription regulation
• The hardware
• The software
• The input
• The output

In the absence of Lactose
http://esg-www.mit.edu:8001/esgbio/pge/lac.html

The Lac Operon (Jacob and Monod)
In the presence of Lactose
http://esg-www.mit.edu:8001/esgbio/pge/lac.html

In the absence of Glucose
http://esg-www.mit.edu:8001/esgbio/pge/lac.html

The logic of the Lac operon regulation
Glucose   Lactose   Activity
+         -         OFF
-         +         ON
-         -         OFF
+         +         OFF
[Diagram labels: CAP site, Operator]
              Glucose: y   Glucose: n
Lactose: n    OFF          OFF
Lactose: y    OFF          ON

Genomic Regulatory Logic

DNA binding proteins for unique pathways

A global map of combinatorial expression control
[Conditions: Heat-shock, Cell cycle, Sporulation, Diauxic shift, MAPK signaling, DNA damage]
[Node label: STRE]
* High connectivity
* Hubs
* Alternative partners in
various conditions
[Figure node labels: PHO4, CCA, ALPHA1, mRPE8, mRPE57, AFT1, PDR, SWI5, MIG1, mRPE69, RAP1, mRPE72, GCN4, CSRE, SFF', mRPE34, MCB, mRPE58, MCM1, mRPE6, RPN4, ECB, BAS1, SCB, LYS14, ABF1, SFF, STE12, ALPHA2, MCM1', ALPHA1', HAP234, mRRPE, PAC, mRRSE3]

Transcription regulation
• The hardware
• The software
• The input
• The output

AlignACE Example
5’- TCTCTCTCCACGGCTAATTAGGTGATCATGAAAAAATGAAAAATTCATGAGAAAAGAGTCAGACATCGAAACATACAT …HIS7
5’- ATGGCAGAATCACTTTAAAACGTGGCCCCACCCGCTGCACCCTGTGCATTTTGTACGTTACTGCGAAATGACTCAACG …ARO4
5’- CACATCCAACGAATCACCTCACCGTTATCGTGACTCACTTTCTTTCGCATCGCCGAAGTGCCATAAAAAATATTTTTT …ILV6
5’- TGCGAACAAAAGAGTCATTACAACGAGGAAATAGAAGAAAATGAAAAATTTTCGACAAAATGTATAGTCATTTCTATC …THR4
5’- ACAAAGGTACCTTCCTGGCCAATCTCACAGATTTAATATAGTAAATTGTCATGCATATGACTCATCCCGAACATGAAA …ARO1
5’- ATTGATTGACTCATTTTCCTCTGACTACTACCAGTTCAAAATGTTAGAGAAAAATAGAAAAGCAGAAAAAATAAATAA …HOM2
5’- GGCGCCACAGTCCGCGTTTGGTTATCCGGCTGACTCATTCTGACTCTTTTTTGGAAAGTGTGGCATGTGCTTCACACA …PRO3
300-600 bp of upstream sequence per gene are searched in Saccharomyces cerevisiae.
AlignACE Example: The Best Motif
5’- TCTCTCTCCACGGCTAATTAGGTGATCATGAAAAAATGAAAAATTCATGAGAAAAGAGTCAGACATCGAAACATACAT …HIS7
5’- ATGGCAGAATCACTTTAAAACGTGGCCCCACCCGCTGCACCCTGTGCATTTTGTACGTTACTGCGAAATGACTCAACG …ARO4
5’- CACATCCAACGAATCACCTCACCGTTATCGTGACTCACTTTCTTTCGCATCGCCGAAGTGCCATAAAAAATATTTTTT …ILV6
5’- TGCGAACAAAAGAGTCATTACAACGAGGAAATAGAAGAAAATGAAAAATTTTCGACAAAATGTATAGTCATTTCTATC …THR4
5’- ACAAAGGTACCTTCCTGGCCAATCTCACAGATTTAATATAGTAAATTGTCATGCATATGACTCATCCCGAACATGAAA …ARO1
5’- ATTGATTGACTCATTTTCCTCTGACTACTACCAGTTCAAAATGTTAGAGAAAAATAGAAAAGCAGAAAAAATAAATAA …HOM2
5’- GGCGCCACAGTCCGCGTTTGGTTATCCGGCTGACTCATTCTGACTCTTTTTTGGAAAGTGTGGCATGTGCTTCACACA …PRO3
Aligned sites:
AAAAGAGTCA
AAATGACTCA
AAGTGAGTCA
AAAAGAGTCA
GGATGAGTCA
AAATGAGTCA
GAATGAGTCA
AAAAGAGTCA
**********
MAP score = 20.37

Transcription regulation
• The hardware
• The software
• The input
• The output

Expression regulation of genes determines complex spatio-temporal patterns

Monitor expression during cell cycle
[Figure: mRNA expression level versus time across two cell cycles (G1, S, G2, M, G1, S, G2, M)]

Genes can be clustered based on time-dependent expression profiles
[Figure: normalized expression versus time-point; axis labels: Time-point 1, Time-point 3, Normalized Expression]

The K-means algorithm
• Start with random positions of centroids.
Iteration = 0

K-means
• Start with random positions of centroids.
• Assign data points to centroids.
Iteration = 1

K-means
• Start with random positions of centroids.
• Assign data points to centroids.
• Move centroids to center of assigned points.
Iteration = 1

K-means
• Start with random positions of centroids.
• Assign data points to centroids.
• Move centroids to center of assigned points.
• Iterate till minimal cost.
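The four K-means steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the lecture figures: the function name `kmeans`, its parameters, and the six three-time-point `profiles` are all hypothetical, squared Euclidean distance is assumed as the cost, and "random positions" is implemented as picking k of the data points at random.

```python
import random


def kmeans(points, k, init=None, n_iter=100, seed=0):
    """Cluster equal-length tuples with the basic K-means loop.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Step 1: start with random centroid positions (here: k random data
    # points), unless explicit starting centroids are supplied.
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Step 2: assign every data point to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Step 4: stop once the assignment no longer changes, i.e. the
        # cost has reached a (local) minimum.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


# Hypothetical normalized expression profiles over three time-points:
# three genes that fall over time and three genes that rise.
profiles = [
    (1.0, 0.1, -1.0), (0.9, 0.0, -1.1), (1.1, 0.2, -0.9),
    (-1.0, 0.0, 1.0), (-0.9, 0.1, 1.1), (-1.1, -0.1, 0.9),
]

labels, centroids = kmeans(profiles, k=2, init=[profiles[0], profiles[3]])
print(labels)
print(centroids)
```

The starting centroids are passed explicitly in the usage example so that the run is reproducible. With a purely random start the loop can settle in a poorer local minimum of the cost, which is why K-means is usually rerun from several starting positions and the lowest-cost result kept.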
Iteration = 3

The diauxic shift
[Figure axis: Time]

Genetic reprogramming of the yeast metabolism upon glucose depletion

At the beginning, when glucose is abundant
[Diagram labels: Glucose; 2 ADP+Pi; 2 ATP; NAD+; NADH; Pyruvate; Lactate; Ferment; Ethanol; Respirate; AcetylCoA; TCA]

~20 hours later, when glucose is depleted
[Diagram labels: Glucose; 2 ADP+Pi; 2 ATP; NAD+; NADH; O2; Pyruvate; Lactate; Ferment; Ethanol; Respirate; AcetylCoA; TCA]

The promoter sequences of coexpressed genes
5’- TCTCTCTCCACGGCTAATTAGGTGATCATGAAAAAATGAAAAATTCATGAGAAAAGAGTCAGACATCGAAACATACAT …HIS7
5’- ATGGCAGAATCACTTTAAAACGTGGCCCCACCCGCTGCACCCTGTGCATTTTGTACGTTACTGCGAAATGACTCAACG …ARO4
5’- CACATCCAACGAATCACCTCACCGTTATCGTGACTCACTTTCTTTCGCATCGCCGAAGTGCCATAAAAAATATTTTTT …ILV6
5’- TGCGAACAAAAGAGTCATTACAACGAGGAAATAGAAGAAAATGAAAAATTTTCGACAAAATGTATAGTCATTTCTATC …THR4
5’- ACAAAGGTACCTTCCTGGCCAATCTCACAGATTTAATATAGTAAATTGTCATGCATATGACTCATCCCGAACATGAAA …ARO1
5’- ATTGATTGACTCATTTTCCTCTGACTACTACCAGTTCAAAATGTTAGAGAAAAATAGAAAAGCAGAAAAAATAAATAA …HOM2
5’- GGCGCCACAGTCCGCGTTTGGTTATCCGGCTGACTCATTCTGACTCTTTTTTGGAAAGTGTGGCATGTGCTTCACACA …PRO3

Promoter Motifs and expression profiles
[Motifs shown: CGGCCCCGCGGA; CTCCTCCCCCCCTTC; TGGCCAATCA; ATGTACGGGTG]
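To make the idea of a promoter motif concrete, the eight aligned sites listed on the "AlignACE Example: The Best Motif" slide can be summarized as a per-position count matrix and a consensus string. The sketch below only summarizes sites that are already aligned. It does not perform AlignACE's motif search and does not compute the MAP score; the names `SITES`, `count_matrix`, and `consensus` are hypothetical helpers introduced for illustration.

```python
from collections import Counter

# The eight aligned sites copied from the "Best Motif" slide.
SITES = [
    "AAAAGAGTCA",
    "AAATGACTCA",
    "AAGTGAGTCA",
    "AAAAGAGTCA",
    "GGATGAGTCA",
    "AAATGAGTCA",
    "GAATGAGTCA",
    "AAAAGAGTCA",
]


def count_matrix(sites):
    """Return one Counter of base counts per alignment column."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("aligned sites must all have the same length")
    return [Counter(site[i] for site in sites) for i in range(length)]


def consensus(matrix):
    """Most frequent base per column; ties go to the first of A, C, G, T."""
    return "".join(max("ACGT", key=lambda base: column[base])
                   for column in matrix)


matrix = count_matrix(SITES)
for position, column in enumerate(matrix, start=1):
    # Print the A/C/G/T counts for each motif position.
    print(position, {base: column[base] for base in "ACGT"})
print(consensus(matrix))  # prints AAATGAGTCA
```

Positions 5 to 10 are nearly invariant across the sites while positions 1 to 4 vary, which a consensus string alone hides. That is one reason motifs are usually handled as count or weight matrices instead of single strings.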