Regulatory Genomics
Lecture 2 November 2012
Yitzhak (Tzachi) Pilpel

Course requirements
• Attendance and participation
• Two reading assignments
• A final take-home exam based on reading papers
• Website
No meeting next week, on Nov 15th.

Expression regulation of genes determines complex spatio-temporal patterns

Monitor expression during cell cycle
[Figure: mRNA expression level versus time over two cell cycles (G1, S, G2, M).]

Genes can be clustered based on time-dependent expression profiles
[Figure: three panels of normalized expression versus time-point (1 to 3).]

The K-means algorithm
• Start with random positions of centroids.
• Assign data points to centroids.
• Move centroids to center of assigned points.
• Iterate until the cost no longer decreases.
[Figure: centroid positions at iterations 0, 1 and 3.]

An expression cluster

1D and 2D clustering of gene expression data

Hierarchical clustering
How to join sets?
[Figure: points a, b, c, d, e, f.]

How to measure a distance between expression profiles?
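The K-means steps listed in the slides above (random centroids, assign, move, iterate) can be sketched in a few lines. This is only a minimal illustration, not the implementation used in the lecture. It assumes squared Euclidean distance, starts the centroids at randomly chosen data points, and stops when the assignment no longer changes. The function names and the toy profiles are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Coordinate-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def kmeans(profiles, k, max_iterations=100, seed=0):
    """Cluster profiles into k groups; returns (assignment, centroids)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    assignment = None
    for _ in range(max_iterations):
        # Assign every profile to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the cost cannot decrease further
        assignment = new_assignment
        # Move each centroid to the center of its assigned profiles.
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:
                centroids[c] = mean_profile(members)
    return assignment, centroids


# Toy data (hypothetical): three rising and three falling profiles.
toy_profiles = [
    [1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1],
    [3.0, 2.0, 1.0], [2.9, 2.1, 1.1], [3.1, 1.9, 0.9],
]
toy_assignment, toy_centroids = kmeans(toy_profiles, k=2)
```

K-means only finds a local optimum, so in practice it is usually run from several random starts and the lowest-cost result is kept.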
[Figure: Gene x and Gene y, time points t1 to t5.]

Clustering the data
Try these two applets at home (needs Java):
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletH.html
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

The common distance metrics

Promoter motifs and expression profiles
CGGCCCCGCGGA
CTCCTCCCCCCCTTC
TGGCCAATCA
ATGTACGGGTG

ChIP - chromatin immunoprecipitation
• Formaldehyde crosslinks living yeast cells.
• Inside the yeast nucleus, the epitope-tagged TF of interest sits on its binding sites.
• Harvest and sonicate; this results in DNA fragments (some of which are bound to proteins).
• ChIP (chromatin immunoprecipitation).
• Reversal of the crosslinks to separate DNA segments from proteins, and fluorescence labeling of each pool separately (unenriched DNA, enriched DNA).
• Hybridization to a DNA array of all yeast intergenic sequences.
• P-value, or confidence level, for each spot in the array.

The total number of protein-DNA interactions in the location analysis data set, using a range of P-value thresholds:
P-value    Interactions
0.05       35,365
0.01       12,040
0.005      8,190
0.001      3,985
A P-value of 0.001 was selected, which minimizes false positives at the expense of gaining false negatives.

Genome-wide Distribution of Transcriptional Regulators
• Promoter regions of 2343 of 6270 yeast genes (37%) were bound by 1 or more of the 106 transcriptional regulators (P = 0.001).
• At P = 0.001, significantly more intergenic regions bind 4 or more regulators than expected by chance.
• On average, a regulator binds 38 promoter regions.

Network Motifs

Network Motifs in the Yeast Regulatory Network
Based on algorithmic analyses performed in Matlab; http://jura.wi.mit.edu/cgibin/young_public/navframe.cgi?s=17&f=networkmotif
[Figure: network motif diagrams (legend: protein, gene) with counts 3, 10, 49, 90, 81, 18.]

The Cell Cycle Transcriptional Regulatory Network
• Ovals represent regulators, connected to the genes they regulate.
• Blue boxes represent sets of genes bound by a common set of regulators.
• Each box is positioned according to the time of peak expression levels for the genes represented by the box.
• The length of an arc defines the period of activity of that regulator.
[Figure: the various stages of the cell cycle.]

Network of Transcriptional Regulators Binding to Genes Encoding Other Transcriptional Regulators

The Central Dogma of Molecular Biology
Expressing the genome
[Diagram labels: inactive DNA, DNA, RNA, mRNA, protein.]

Translation consists of initiation, elongation and termination
[Diagram labels: 5', 3', codon, anti-codon, STOP.]

The ribosome attachment site determines initiation rate
[Figure: yeast and E. coli.]

A consensus for S. cerevisiae ribosome attachment sites?
[Figure: sequence conservation (0% to 100%) by position relative to ATG.]

How good is it as a “ribosomal attachment site”?
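One common way to answer "how good is a site" against a consensus is a position weight matrix (PWM) log-odds score. The slides do not say how the ribosomal attachment site score was computed, so the sketch below is an assumption used for illustration only. The aligned sites are made-up toy sequences, not yeast data, and the function names, the uniform background and the pseudocount are all hypothetical choices.

```python
import math


def build_pwm(sites, alphabet="ACGT", pseudocount=1.0):
    """Build a log2-odds PWM from equal-length aligned sites.

    Assumes a uniform background and a simple additive pseudocount.
    """
    background = 1.0 / len(alphabet)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(alphabet)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in alphabet
        })
    return pwm


def score_site(pwm, site):
    """Sum of per-position log-odds; higher means closer to the consensus."""
    return sum(column[base] for column, base in zip(pwm, site))


# Toy aligned sites ending in the start codon (hypothetical, for illustration).
toy_sites = ["AAAAATG", "AAACATG", "TAAAATG", "AACAATG"]
toy_pwm = build_pwm(toy_sites)
consensus_like_score = score_site(toy_pwm, "AAAAATG")
poor_score = score_site(toy_pwm, "CGCGATG")
```

Ranking all genes by such a score would give a "bad to good" ordering of the kind plotted on the next slide, under the stated assumption about the scoring scheme.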
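[Figure: ribosomal attachment site score.]
[Diagram: sequence GCG GCG GCG CAG GCG CTG GCG GCG GCG GCG CGC, with the 5' and 3' ends marked.]

The sequence adaptation score of proteins in yeast
[Figure: ribosomal attachment site score (0 to 120) versus rank, from bad score to good score; CRP marked.]

Multiple codons for the same amino acid
             C1   C2   C3   C4   C5   C6
Serine:      UCU  UCC  UCA  UCG  AGC  AGU
Cysteine:    UGU  UGC
Methionine:  AUG
STOP:        UAA  UAG  UGA

G  T  R  Y  E  C  Q  A  S  F  D
C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1
C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2
C1 C1 C2 C1 C1 C2 C1 C1 C2 C1 C1
C2 C2 C2 C2 C1 C1 C1 C1 C1 C1 C1
C1 C1 C1 C1 C1 C1 C1 C2 C2 C2 C2
For a hypothetical protein of 300 amino acids, each encoded by two codons, there are 2^300 possible nucleotide sequences. These variants code for the same protein and are thus considered “synonymous”. Evolution could readily exchange one for another.

Selection of codons might affect:
• Accuracy
• Throughput
• RNA structure
• Costs
• Folding

A simple model for translation efficiency: the tRNA Adaptation Index (tAI)
[Diagram: mRNA codons … ATC CCA AAA TCG … AAT …, with the wobble interaction marked.]
Absolute adaptiveness of codon i:
$W_i = \sum_{j=1}^{n_i} (1 - s_{ij}) \, \mathrm{tRNA}_{ij}$
Relative adaptiveness of codon i:
$w_i = W_i / W_{\max}$ if $W_i \neq 0$, else $w_i = w_{\mathrm{mean}}$
tAI of gene g:
$\mathrm{tAI}_g = \left( \prod_{k=1}^{\ell_g} w_{i_{kg}} \right)^{1/\ell_g}$
Here n_i is the number of tRNA species that read codon i, tRNA_ij is the gene copy number of the j-th of them, s_ij is the penalty on the wobble interaction between them, i_kg is the codon at position k of gene g, and \ell_g is the length of gene g in codons.
dos Reis et al. NAR 2004

Supply, demand and charging

How does RNA structure influence translation?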
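The tAI definition above (dos Reis et al. NAR 2004) is a geometric mean of per-codon weights, which a short sketch can make concrete. All tRNA copy numbers and wobble penalties below are invented toy values, and the function names are hypothetical. The sketch also assumes that w_mean is the geometric mean of the non-zero weights, which should be checked against the original paper.

```python
import math


def relative_adaptiveness(trna_copies, wobble_s):
    """Per-codon weights w_i from tRNA gene copy numbers and wobble penalties.

    trna_copies[codon] lists the copy number of each tRNA reading that codon;
    wobble_s[codon] lists the matching penalties s_ij (0 = no penalty).
    """
    absolute = {
        codon: sum((1.0 - s) * n for s, n in zip(wobble_s[codon], copies))
        for codon, copies in trna_copies.items()
    }
    w_max = max(absolute.values())
    nonzero = [W / w_max for W in absolute.values() if W > 0]
    # Assumption: w_mean is the geometric mean of the non-zero weights.
    w_mean = math.exp(sum(math.log(w) for w in nonzero) / len(nonzero))
    return {
        codon: (W / w_max if W > 0 else w_mean)
        for codon, W in absolute.items()
    }


def tai(codons, weights):
    """Geometric mean of the weights of a gene's codons."""
    return math.exp(sum(math.log(weights[c]) for c in codons) / len(codons))


# Toy values (hypothetical): four alanine codons, one with no matching tRNA.
toy_copies = {"GCU": [10], "GCC": [10], "GCA": [4], "GCG": [0]}
toy_penalties = {"GCU": [0.0], "GCC": [0.5], "GCA": [0.0], "GCG": [0.0]}
toy_weights = relative_adaptiveness(toy_copies, toy_penalties)
toy_tai = tai(["GCU", "GCC", "GCA"], toy_weights)
```

Because tAI is a geometric mean, a single poorly adapted codon pulls the score down more than it would in an arithmetic average.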
Conclusions from synthetic library
• No correlation between CAI (codon adaptation index) and protein expression.
• Positive correlation between the structure’s energy and expression.
• The 5’ window needs to be unfolded for high expression.
[Figures: protein abundance on the vertical axis.]

A genome-wide method to measure translation efficiency (Ingolia Science 2009)

Translational response to starvation

mRNA abundance: production and degradation
[Diagram: mRNA abundance as set by production and degradation; Options 1 to 4.]

Relationship between gene expression levels and mRNA decay rates across genes
• A study in a human population examined variation in decay rates and steady-state mRNA levels across people.
• It found strong correlations, either negative or positive, between mRNA level and decay rate.
• Fast-responding genes show a “discordant” relation, suggesting that increased expression is often accompanied by an increased decay rate.

The various phases are coupled

At the hardware level (post-transcription: RNA-binding proteins)
G1  1 1 1 0
G2  1 0 0 1
G3  0 1 1 1

At the hardware level (post-transcription: microRNA)
[Diagram: RISC complexes on their targets.]
G1  1 1 1 0
G2  1 0 0 1
G3  0 1 1 1
Yang CGFR 16:397, 2005

Computational approaches to find microRNA genes
• MiRscan (Lim et al. 2003)
  – Scans for conserved hairpin structures in both C. elegans and C. briggsae.
  – Uses known microRNA genes (50) as a training set.

What is the effect of overexpression of a miR?
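The link made above between mRNA abundance, production and degradation can be illustrated with the standard first-order model dm/dt = production - decay * m. This model is an assumption brought in for illustration; the slides do not state it. It gives a steady state of production / decay and a response half-time of ln(2) / decay, so two genes can have the same abundance while the one with faster decay responds faster. The rates below are arbitrary toy values and the function names are hypothetical.

```python
import math


def steady_state_level(production_rate, decay_rate):
    """Steady-state mRNA level of dm/dt = production - decay * m."""
    return production_rate / decay_rate


def level_at_time(t, initial_level, production_rate, decay_rate):
    """Exact solution of the first-order model at time t."""
    m_ss = steady_state_level(production_rate, decay_rate)
    return m_ss + (initial_level - m_ss) * math.exp(-decay_rate * t)


def response_half_time(decay_rate):
    """Time to cover half the distance to a new steady state."""
    return math.log(2) / decay_rate


# Two hypothetical genes with the same steady-state abundance (20 units):
slow_gene = {"production_rate": 10.0, "decay_rate": 0.5}
fast_gene = {"production_rate": 20.0, "decay_rate": 1.0}
same_abundance = (
    steady_state_level(**slow_gene) == steady_state_level(**fast_gene)
)
fast_gene_responds_sooner = (
    response_half_time(fast_gene["decay_rate"])
    < response_half_time(slow_gene["decay_rate"])
)
```

Under this model the response time depends only on the decay rate, which is one way to read the observation that fast-responding genes combine higher expression with higher decay.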
Non-coding RNAs are often co-targeted with their own targets for various cellular needs

miR-124 similarly decreases the abundance and translation of its mRNA targets

microRNA expression profiles classify human cancers
[Figure: miRs (rows) versus samples (patients).]
Lu et al. Nature 435: 834, 2005

Gene expression is noisy

Fluorescence distribution shapes

The cell intrinsic and extrinsic contributions to noise

The actual intrinsic and extrinsic sources of noise
• Extrinsic: variation in copy numbers of molecules among cells.
• Intrinsic: stochastic events.
[Diagram. Extrinsic: regulation by transcription factors, ribosome, RNA polymerase. Intrinsic: chromatin remodeling, protein degradation, transcription process, translation process. Flow: DNA, RNA, protein.]

A theoretical approach
[Diagram: DNA, mRNA, protein.]

The ratio of transcription to translation should affect noise

Transcription bursts should affect noise

Can noise be useful?
• The native network shows longer and more duration-diverse competence periods.
• The native network does better over a wider range of extracellular [DNA].
• The trade-off: high competence allows finding solutions, but reduces growth rate.

Questions about noise
• What are the sources of noise?
• How is noise regulated in cells?
• How is it tolerated by the biological systems that need to be noise free?
• When is noise advantageous, deleterious or neutral?
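The statement above that the ratio of transcription to translation should affect noise can be illustrated with the standard two-stage (mRNA then protein) stochastic model. That model is an assumption brought in here and is not taken from the slides. Its intrinsic protein noise is CV^2 = 1/<p> + (1/<m>) * g_p / (g_m + g_p), where <m> and <p> are the mean mRNA and protein numbers and g_m, g_p are their decay rates. All rate values below are arbitrary toy numbers, and the function name is hypothetical.

```python
def two_stage_protein_noise(k_m, g_m, k_p, g_p):
    """Mean protein number and intrinsic CV^2 in the two-stage model.

    k_m: transcription rate, g_m: mRNA decay rate,
    k_p: translation rate per mRNA, g_p: protein decay rate.
    """
    mean_mrna = k_m / g_m
    mean_protein = mean_mrna * k_p / g_p
    cv_squared = 1.0 / mean_protein + (1.0 / mean_mrna) * g_p / (g_m + g_p)
    return mean_protein, cv_squared


# Two hypothetical strategies that reach the same mean protein level:
# many mRNAs translated rarely, or few mRNAs translated often.
mean_a, noise_a = two_stage_protein_noise(k_m=10.0, g_m=1.0, k_p=1.0, g_p=0.1)
mean_b, noise_b = two_stage_protein_noise(k_m=1.0, g_m=1.0, k_p=10.0, g_p=0.1)
```

In this sketch both strategies give the same mean, but the low-transcription, high-translation one has the larger CV^2, because each rare mRNA produces a large burst of protein.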