Quantitative Biology Module Fall 2016
Homework 1: Biological Numeracy and a Feeling for the Organism
Hernan G. Garcia
August 26, 2016

Homeworks in the Quantitative Biology Module

Over the course of the module, we will post a few homeworks. These will include several problems aimed at cementing the skills and ideas developed during the course, as well as expanding them. They will not be graded. However, we will post solutions on a password-protected website and set up office hours for discussion of these problems or any other topic from the lectures.

The objective of this homework is to get a feeling for the numbers in whatever problem you're considering in biology. Just as you always need to check the units in your calculations, a more subtle sanity check of your theoretical results comes from having some expectation about the order of magnitude you will obtain.

A Feeling for the Numbers in Biology

In this section, we flex our quantitative muscles a little bit more to develop various estimates regarding biological systems.

1 Protein Sequences: The Frances Arnold Estimate Problem

In a 2001 Bioengineering seminar at Caltech, Professor Frances Arnold made a startling remark, which this problem sets out to examine. The basic point is to generate some intuition for the HUGE, ASTRONOMICAL number of ways of choosing amino acid sequences. To drive home the point, she noted that if we consider a protein with 300 amino acids, there will be a huge number of different possible sequences.

(a) How many different sequences are there for a 300 amino acid protein?

But that wasn't the provocative remark. The provocative remark was that if we took only one molecule of each of these different possible proteins, it would take a volume equal to five of our universes to contain all of these distinct molecules.

(b) Estimate the size of a protein with 300 amino acids. Justify your result, but remember it is an estimate.
Next, find an estimate of the size of the universe and figure out whether Frances was guilty of hyperbole or whether her statement was on the money.

2 DNA Synthesis Over Your Lifetime

Estimate the total length of DNA your body will produce over your lifetime. Hint: Figure out a way to estimate how often some of the cells in your body are replaced.

3 ATP Synthesis

Estimate the daily ATP synthesis by ATP synthases in a human. To do this estimate, imagine a typical human diet and hypothesize that roughly half of the caloric input is converted into ATP. What is the mean rate of ATP synthesis per cell in the human body resulting from this estimate? If each ATP synthase synthesizes roughly 100 ATPs per second (see Alberts, MBoC, chap. 14), what does this imply about the number of such ATP synthases per cell? Note: the point of this question is to give us a feeling for the numbers; the actual numbers might be substantially different depending on cell type and physiology.

E. coli by the Numbers

4 Is There Enough Time to Replicate the Genome?

E. coli's circular genome is about 5 million base pairs long. A special sequence called the origin of replication recruits two replisomes that replicate DNA as they move in opposite directions. Each replisome moves at a rate of about 200-1000 bp/s.

(a) How long does it take for these two replisomes to make a copy of the E. coli genome?

(b) How can E. coli manage to replicate as fast as every 20 minutes?

(Problem adapted from Moran et al., Cell 141:1262-1262 e1 (2010).)

5 E. coli in culture

(a) A saturated E. coli culture contains approximately 10^9 cells/mL. What is the mass density of such a culture? What is the mean spacing between cells?

(b) DNA replication in E. coli introduces 10^-9 mutations/bp on each DNA strand. Estimate the total number of mutations introduced in a saturated culture. Hint: Estimate the number of cell divisions in the last round of replication before reaching saturation.
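For the E. coli estimates in Problems 4(a) and 5(a), the arithmetic might be set up along the following lines. This is only a sketch: the inputs are the values quoted in the problem statements, while the function names and the choice to take the mean spacing as the cube root of the volume available per cell are illustrative assumptions, not a prescribed solution.

```python
# Rough arithmetic for Problems 4(a) and 5(a).
# All numerical inputs are the values quoted in the problem statements;
# the function names are illustrative, not part of the homework.

GENOME_BP = 5e6        # E. coli genome length in base pairs
N_REPLISOMES = 2       # two replisomes moving in opposite directions
CELLS_PER_ML = 1e9     # density of a saturated culture


def replication_time_min(rate_bp_per_s, genome_bp=GENOME_BP,
                         n_replisomes=N_REPLISOMES):
    """Minutes needed to copy the genome if each replisome covers an equal share."""
    seconds = genome_bp / (n_replisomes * rate_bp_per_s)
    return seconds / 60.0


def mean_spacing_um(cells_per_ml=CELLS_PER_ML):
    """Mean cell-to-cell distance, assumed to be the cube root of the volume per cell."""
    um3_per_ml = 1e12  # 1 mL = 1 cm^3 = (1e4 um)^3
    volume_per_cell_um3 = um3_per_ml / cells_per_ml
    return volume_per_cell_um3 ** (1.0 / 3.0)


if __name__ == "__main__":
    for rate in (200, 1000):  # the quoted range of replisome speeds, bp/s
        print(f"{rate} bp/s -> {replication_time_min(rate):.0f} min")
    print(f"mean spacing ~ {mean_spacing_um():.0f} um")
```

Even at the fast end of the quoted speed range, the replication time obtained this way is longer than the 20-minute division time mentioned in Problem 4(b), which is the puzzle that part asks you to resolve.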
(Problem adapted from Moran et al., Cell 141:1262-1262 e1 (2010).)

6 Mind your media: carbon, nitrogen, and phosphate content of cells

Minimal growth medium for bacteria such as E. coli includes various salts with characteristic concentrations in the mM range and a carbon source. The carbon source is typically glucose, used at 0.5% (a concentration of 0.5 g/100 mL). For nitrogen, minimal medium contains ammonium chloride (NH4Cl) at a concentration of 0.1 g/100 mL.

(a) Make an estimate of the number of carbon atoms it takes to make up the macromolecular contents of a bacterium such as E. coli. Similarly, make an estimate of the number of nitrogen atoms it takes to make up the macromolecular contents of a bacterium. What about phosphate? Hint: You can use Table 1.

(b) How many cells can be grown in a 5 mL culture using minimal medium before the medium exhausts the carbon? How many cells can be grown in a 5 mL culture using minimal medium before the medium exhausts the nitrogen? Note that this estimate will be flawed because it neglects the energy cost of synthesizing the macromolecules of the cell.

Table 1: Observed macromolecular census of an E. coli cell. (Data from F. C. Neidhardt et al., Physiology of the Bacterial Cell, Sunderland, Sinauer Associates Inc., 1990 and M. Schaechter et al., Microbe, Washington DC, ASM Press, 2006.)

Substance                              % of total dry weight   Number of molecules
Macromolecule
Protein                                55.0                    2.4 × 10^6
RNA                                    20.4
  23S RNA                              10.6                    19,000
  16S RNA                              5.5                     19,000
  5S RNA                               0.4                     19,000
  Transfer RNA (4S)                    2.9                     200,000
  Messenger RNA                        0.8                     1,400
Phospholipid                           9.1                     22 × 10^6
Lipopolysaccharide (outer membrane)    3.4                     1.2 × 10^6
DNA                                    3.1                     2
Murein (cell wall)                     2.5                     1
Glycogen (sugar storage)               2.5                     4,360
Total macromolecules                   96.1
Small molecules
Metabolites, building blocks, etc.     2.9
Inorganic ions                         1.0
Total small molecules                  3.9

The Central Dogma by the Numbers

7 Transcription by the Numbers

Consider a characteristic transcription rate of roughly 50 nucleotides/s (the range of experimental values is roughly 10-70 nucleotides/s; see Bionumbers), and note that the footprint of the polymerase on the DNA is of order 50 nucleotides (the actual value is closer to 60 nucleotides). Compute the maximum rate at which the transcription apparatus can produce new transcripts.

8 Translation by the numbers

(a) Ribosomes can synthesize new proteins at a rate of roughly 20 aa/s. Following logic similar to that used for transcription, use the fact that the footprint of the ribosome on the mRNA is roughly 35 nucleotides wide to determine the maximal rate at which new proteins could be synthesized.

(b) Given a lifetime of an mRNA molecule in E. coli of about 5 min, how many proteins can be produced from this transcript during its lifetime?

(c) If you use this as the mean number of proteins per gene, how many total proteins would you estimate in E. coli?

Flies by the numbers

9 Making the Fly

In this problem, we carry out estimates of the number of cells that make up different structures in the fly. Describe your calculations in detail.

(a) Estimate the number of cells in the fly wing. At three hours into development, there are about 6 cells that will make the future wing imaginal disc. Calculate the number of divisions needed to get from this stage to an adult wing and the time between mitoses. To learn more about this, you might want to check out Abouchar et al. (2014), J R Soc Interface 11:20140443 and Garcia-Bellido et al. (1979), Scientific American 241:102.

(b) How many cells are there in a fly eye? Use this chance to learn about ommatidia, and the number of cells and of photoreceptors.
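Returning to the footprint argument of Problems 7 and 8, one possible way to lay out the arithmetic is sketched below. It assumes that a new polymerase or ribosome can start only once the previous one has moved a full footprint length, so it gives an upper bound rather than a measured rate. The helper name max_initiation_rate is hypothetical; the numerical inputs are those quoted in the problem statements, and the factor of 3 nucleotides per amino acid is the codon length.

```python
# Footprint-limited upper bounds for Problems 7, 8(a) and 8(b).
# Assumption: machines are packed back to back, so a new one can load
# as soon as the previous one has advanced by one footprint.

NT_PER_CODON = 3  # three nucleotides encode one amino acid


def max_initiation_rate(elongation_nt_per_s, footprint_nt):
    """Upper bound on initiations per second for back-to-back packing."""
    return elongation_nt_per_s / footprint_nt


# Problem 7: polymerase moving at 50 nt/s with a 50 nt footprint.
transcripts_per_s = max_initiation_rate(50, 50)

# Problem 8(a): ribosome moving at 20 aa/s with a 35 nt footprint.
ribosome_speed_nt_per_s = 20 * NT_PER_CODON
proteins_per_s = max_initiation_rate(ribosome_speed_nt_per_s, 35)

# Problem 8(b): proteins made from one mRNA over a 5 min lifetime.
mrna_lifetime_s = 5 * 60
proteins_per_mrna = proteins_per_s * mrna_lifetime_s

if __name__ == "__main__":
    print(f"max transcripts per gene: {transcripts_per_s:.1f} per s")
    print(f"max proteins per mRNA:    {proteins_per_s:.2f} per s")
    print(f"proteins per mRNA lifetime (upper bound): {proteins_per_mrna:.0f}")
```

Because real transcripts are rarely saturated with ribosomes, the per-transcript protein count obtained this way is best read as a ceiling.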
10 Transcription and translation in development

(a) The average length of a gene in Drosophila melanogaster is about 11 kb and the average elongation rate of a transcript is about 1.2 kb/min. How does the time to produce an average mRNA compare to the nuclear cycle times in the initial stages of fly development? For example, nuclear cycles 9 through 11 last no more than 6 minutes, while nuclear cycle 12 lasts about 10 minutes and cycle 13 on the order of 12 minutes. How do the genes that are actually expressed at this stage compare to the average gene in terms of their lengths? You can find the size of these genes by going to flybase.org and searching for hb (hunchback), gt (giant), kr (krüppel) and kni (knirps).

(b) When Drosophila eggs are laid they already contain mRNA for several "maternal factors". Bicoid is an example of such a factor. Its mRNA is localized at the anterior end of the embryo, serving as a source of Bicoid protein. It is essentially stable up until the end of nuclear cycle 14, when it gets actively degraded. In this problem we want to estimate the number of mRNA molecules deposited in the embryo by its mother from measurements of the number of Bicoid proteins at nuclear cycle 14. Assume that all Bicoid is localized to the nuclei, which at cycle 14, approximately 120 minutes after the egg is laid, have a radius of about 3.3 µm. Use the data shown in Figure 1(B) to estimate the total number of Bicoid molecules in the whole embryo at this point. Assuming that translation of bicoid mRNA is constant, estimate the number of mRNA molecules that led to your calculated number of Bicoid proteins. You might find it useful to estimate the number of ribosomes per kb on a transcript from Figure 2 and to use the translation rate discussed in Problem 8.

[Figure 1 image not reproduced. Panel (B) axes: x/L versus [Bcd]nuc (nM). Panel (C) axes: raw anti-Bcd intensity versus raw anti-Hb intensity.]

Figure 1: Spatial patterning of Bicoid and Hunchback. (A) Fluorescent image showing Bicoid (green) and Hunchback (red), which are both localized to the anterior half of the embryo. DNA is stained with a different color (blue), allowing individual nuclei to be identified. (B) Quantification of the Bicoid gradient as a function of relative position with respect to the egg length on the dorsal (red) and ventral (blue) sides of the embryo. The black points denote the background fluorescence intensity of an embryo that does not harbor fluorescent proteins. (C) Scatter plot of Hunchback versus Bicoid fluorescence intensity from 1299 identified nuclei in a single embryo, showing a sharp onset of Hunchback with increasing Bicoid concentration. (Adapted from T. Gregor et al., Cell 130:153, 2007.)

[Figure 2 image not reproduced. Labels in the image: ribosomes, RNA, DNA, transcription start.]

Figure 2: Electron microscopy image of simultaneous transcription and translation. The image shows bacterial DNA and its associated mRNA transcripts, each of which is occupied by ribosomes. (Adapted from O. L. Miller et al., Science 169:392, 1970.)

11 Mutation correlation and physical proximity on the gene

Section 4.6.1 of Physical Biology of the Cell briefly describes Sturtevant's analysis of mutant flies, which culminated in the generation of the first chromosome map. For a more detailed explanation refer to Sturtevant, Journal of Experimental Zoology, 14:43 (1913) (a version of this paper with a modern introduction can be found on the course website). In Table 2, we show the crossover data associated with the different mutations that he used to draw the map. A crossover refers to a chromosomal rearrangement in which parts of two chromosomes exchange DNA. An illustration of the process is shown in Figure 3.

The six factors looked at by Sturtevant are B, C, O, P, R, and M. Flies recessive in B, the black factor, have a yellow body color. Factors C and O are completely linked: they always go together, and flies recessive in both of these factors have white eyes. A fly recessive in factor P has vermilion eyes instead of the ordinary red eyes. Finally, flies recessive in R have rudimentary wings and those recessive in M have miniature wings. In Table 2, the fraction of flies that presented a crossover of two factors is labeled by the pair; for example, the crossover fraction for the B and P factors is denoted BP.

Table 2: Fraction of crossovers of six sex-linked factors in Drosophila. (Adapted from A. H. Sturtevant, J. Exp. Zool. 14:43, 1913.)

Factors     Fraction of crossovers
BR          115/324
B(C,O)      214/21736
(C,O)P      471/1584
(C,O)R      2062/6116
(C,O)M      406/898
PR          17/573
PM          109/458
BP          1464/4551
BM          260/693

Assume that the frequency of recombination is proportional to the distance between loci on the chromosome. Reproduce Sturtevant's conclusions by drawing your own map using the first seven data points from Table 2. Keep in mind that shorter "distances" are more reliable than longer ones because the latter are more prone to double crossings. Are distances additive? For example, can you predict the distance between B and P from the distances B(C,O) and (C,O)P? What is the interpretation of the last two data points from Table 2?

Figure 3: Crossing over of chromosomes. (A) Chromosomes before crossing over showing two loci labeled P and M. (B) Illustration of the crossing over event. (C) Chromosomes after crossover.
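For Problem 11, the crossover fractions in Table 2 can be turned into numbers to work with using a short script such as the one below. It is only a starting point under the problem's stated assumption that recombination frequency is proportional to distance: the fractions are copied from Table 2, while the function names and the choice to express distances as percentages (one map unit per percent of crossovers) are illustrative conventions rather than part of the assignment.

```python
from fractions import Fraction

# Crossover fractions copied from Table 2 (Sturtevant, 1913).
CROSSOVERS = {
    "BR": Fraction(115, 324),
    "B(C,O)": Fraction(214, 21736),
    "(C,O)P": Fraction(471, 1584),
    "(C,O)R": Fraction(2062, 6116),
    "(C,O)M": Fraction(406, 898),
    "PR": Fraction(17, 573),
    "PM": Fraction(109, 458),
    "BP": Fraction(1464, 4551),
    "BM": Fraction(260, 693),
}


def map_distance(pair):
    """Crossover fraction expressed as a percentage (map units)."""
    return 100.0 * float(CROSSOVERS[pair])


def additivity_gap(first, second, direct):
    """Sum of two adjacent distances minus the directly measured one."""
    return map_distance(first) + map_distance(second) - map_distance(direct)


if __name__ == "__main__":
    for pair in CROSSOVERS:
        print(f"{pair:8s} {map_distance(pair):6.2f}")
    gap = additivity_gap("B(C,O)", "(C,O)P", "BP")
    print(f"B(C,O) + (C,O)P - BP = {gap:.2f} map units")
```

Comparing the sum of the two short distances with the directly measured BP value is one way to start on the additivity question; interpreting any discrepancy is left to the problem.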