The Human Genome, and Human Complexity
Yoni Toker

Kolmogorov: the complexity of an object is the length of the shortest computer program that creates the object.

Viewpoint
GENE NUMBER: What If There Are Only 30,000 Human Genes?
Jean-Michel Claverie, Science 16 February 2001: Vol. 291, no. 5507, pp. 1255-1257
Humans: ~30,000 genes
Worm (Caenorhabditis elegans): ~20,000 genes
Are we not much more complicated than worms?

Mapping of the Human Genome
1953: Rosalind Franklin, James Watson and Francis Crick discover the double helical structure of DNA.
Mid-1980s: Human Genome Project suggested

Objections to the Human Genome Project
•Too hard: the human genome is 3e+9 base pairs long. A lab (in the 1980s) could sequence 500 base pairs a day.
 3e+9 base pairs / 500 base pairs a day / 365 days a year ~ 16,000 years
•Too expensive!
•Not the way to do biology: biology is hypothesis-driven experiments, not a fishing expedition

Mapping of the Human Genome
1990: Human Genome Project announced. Goal: sequence the entire human genome in 15 years, with a budget of $3 billion.
 Comparison: LHC budget ~5 billion; aircraft carrier ~10 billion
1998: Only 5% of genome sequenced
 "I (Celera) will decode the entire human genome in just 3 years, with a budget of only $300 million"

Sequencing small pieces of DNA
[Figure: a primer and DNA fragments of increasing length]
F. Sanger et al., Nature 265, 687 (1977).
E. C. Strauss, J. A. Kobori, G. Siu, L. E. Hood, Anal. Biochem. 154, 353 (1986).

Sequencing Large DNAs
The whole-genome shotgun method

Fierce competition... comes to a draw
June 26, 2000: President Clinton, with J. Craig Venter, left, and Francis Collins, announces completion of "the first survey of the entire human genome."

Technology is getting better: Solexa sequencing

Technology is getting better!
[Figure: size of largest project (bp, log scale) versus year of publication, 1960-2000, for sequencing and synthesis; annotation: 1e+5]

Oligonucleotide Synthesis
•1) De-Blocking: dichloroacetic acid (DCA) or trichloroacetic acid in dichloromethane (DCM). DMT = dimethoxytrityl
•2) Base Condensation
•3) Capping
•4) Oxidation

DNA Synthesis

Genetic Code
4 bases, 20 amino acids
Every 3 bases code for an amino acid
Example: CCG = Proline

From DNA to Proteins

Some of the things we learned
•Human genome contains 3e+9 base pairs
•Less than 2% of the genome is genes
•Average gene length 3,000 base pairs
•Number of genes ~30,000
•98% of genes identical between all people: only 1-2% of genes responsible for color of eyes, genetic diseases…

Genome Size
Species                                     Size of genome (base pairs)   Number of genes
Human                                       2900 e+6                      30,000
Fruit fly (Drosophila melanogaster)         120 e+6                       13,601
Baker's yeast (Saccharomyces cerevisiae)    12 e+6                        6,275
Worm (Caenorhabditis elegans)               97 e+6                        19,000
E. coli                                     4.1 e+6                       4,800
Arabidopsis (Arabidopsis thaliana)          125 e+6                       25,000

Viewpoint
GENE NUMBER: What If There Are Only 30,000 Human Genes? Jean-Michel Claverie
•Are we really more complicated than flies and worms?
•30,000 is much more complicated than 20,000
•Gene number isn't everything

30,000 is much more complicated than 20,000
2^30,000 / 2^20,000 = 2^10,000 ~ 10^3000

Gene Number isn't everything
mRNA: 30,000 genes, but more than 85,000 mRNA species
•Alternative splicing
•mRNA editing
Vertebrate immune system: gene sites, antibody

Complexity comes from more sophisticated regulation mechanisms!
More sophisticated methods of gene expression and regulation
mRNA: mRNA editing…
Proteins change their function:
•Number of sugars attached
•Folding/Unfolding
•…

Genetic Networks
Claverie: every gene is connected on average to 4-5 other genes
We are not much more complicated than an airplane!
But: genetic networks follow a power law distribution
[Figure: distribution of the number of connections]
Average is not very meaningful!

Summary
Human Genome Project
•Decoding the "part list" of humans
•Extraordinary technological advances
Complexity: the genome is just the beginning

Aim High! Dream On!
•Sequence more and more organisms
•Find the genes for genetic diseases
•Reconstruct the tree of life
•Learn more of nature's tricks
•Creation of synthetic life
•DNA nanotechnology
•Producing clean energy, depositing CO2…
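
Appendix: checking the back-of-envelope numbers
The two rough calculations in the deck (the ~16,000 years for a single 1980s lab, and the "30,000 is much more complicated than 20,000" comparison) can be reproduced with a minimal Python sketch. The function names are hypothetical, and the second calculation assumes a toy model not spelled out on the slides: each gene is simply on or off, so n genes give 2^n possible expression patterns.

```python
import math

# Figures quoted in the deck.
GENOME_BP = 3e9          # human genome length, base pairs
BP_PER_LAB_DAY = 500     # throughput of one lab in the 1980s
DAYS_PER_YEAR = 365


def single_lab_years(genome_bp=GENOME_BP, bp_per_day=BP_PER_LAB_DAY):
    """Years one lab would need to read the genome once at a fixed daily rate."""
    return genome_bp / bp_per_day / DAYS_PER_YEAR


def on_off_states_log10(n_genes):
    """log10 of 2**n_genes, the pattern count in the assumed on/off toy model."""
    return n_genes * math.log10(2)


years = single_lab_years()
# Ratio of human to worm pattern counts, as a power of ten.
ratio_exponent = on_off_states_log10(30_000) - on_off_states_log10(20_000)

print(f"One 1980s lab: about {years:,.0f} years")
print(f"2^30,000 / 2^20,000 is about 10^{ratio_exponent:.0f}")
```

The first figure comes out a little above 16,000 years, and the exponent comes out close to 3010, which the deck rounds to 10^3000. Working with logarithms keeps the second calculation in ordinary floating point instead of building a 3,000-digit integer.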