SPECIAL COMMUNICATION
The Anatomy of the Human Genome
A Neo-Vesalian Basis for Medicine in the 21st Century
Victor A. McKusick, MD
JAMA. 2001;286:2289-2295. www.jama.com
©2001 American Medical Association. All rights reserved. (Reprinted) JAMA, November 14, 2001, Vol 286, No. 18. Downloaded from www.jama.com at Washington University - St Louis, on October 27, 2005.
Author Affiliation: McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Md.
Disclosure: Dr McKusick was a member of the Program Advisory Committee for the National Institutes of Health Human Genome Project 1989-1992 and is a member of the Scientific Advisory Board of Celera Genomics Inc.
Corresponding Author and Reprints: Victor A. McKusick, MD, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins Hospital, 600 N Wolfe St, Blalock Bldg, Room 1007, Baltimore, MD 21287-4922.
Since 1956, the anatomy of the human genome has been described on the basis of chromosome studies, gene mapping, and DNA sequencing. The gross anatomy of Andreas Vesalius, published in 1543, played a leading role in the development of modern medicine. The objective of this article is to show that knowledge of genomic anatomy is having a comparably strong and pervasive influence on all of medicine. The research revealing human genome anatomy is reviewed. The insight provided by genome anatomy has brought about shifts of focus, both in research and in the clinic, eg, from genomics to proteomics and from the individually rare, single-gene disorders to common disorders. Genomic anatomy permits medicine to become more predictive and preventive. At the same time, diagnosis and treatment are rendered more sensitive, specific, effective, and safe. Hazards in misuse and misunderstanding of the information exist. Education of both the public and health professionals is vital if the full benefits of neo-Vesalian medicine are to be realized.
The linear arrangement of genes on our chromosomes is part of our microanatomy. When we speak of mapping genes on chromosomes, we use a cartographic metaphor. An equally appropriate anatomic metaphor is the anatomy of the human genome.1-3 Clinical cytogenetics (starting in the late 1950s), mapping genes on chromosomes (beginning for autosomes in the late 1960s), and comprehensive DNA sequencing of the genome (initiated in the late 1980s) have provided, in the words of Charles Scriver, MDCM (oral communication, 1982), a neo-Vesalian basis for medicine. The influence on medicine is fully as great as was that of Andreas Vesalius’ de corporis humani Fabrica, which was published in 1543 and was the basis of Harvey’s physiology of the circulation (1628) and Morgagni’s morbid anatomy (1761).
The history of medical genetics4 can be discussed in 2 parts, the pre-1956 foundations of medical genetics going back to Mendel and the greats of the first half of the 20th century, and the developments in the period since 1956, during which medical genetics has evolved into a full-fledged clinical and academic field. The objective of this article is to trace the influence of chromosome studies, gene mapping, and DNA sequencing (the Human Genome Project [HGP]) on the evolution of medical genetics since 1956.
Clinical Cytogenetics (Chromosomology)
1956 was a watershed year in the history of medical genetics. In that year, the correct diploid chromosome number of 46 (not 48, as previously thought) was established.5,6 It is remarkable that it was not until 3 years after the determination of the double-helical structure of DNA by Watson and Crick7 that the correct number of chromosomes in humans was determined. The advance was significant to medicine, not because of the specific numerology but because of the associated simple improvements in technique that made chromosome analysis feasible in the study of disease and in clinical diagnosis. Medical genetics, which really did not exist as a clinical specialty before 1956, was given its own organ, the nucleus, just as cardiology had the heart, neurology the nervous system, etc.
Not only was the correct chromosome number established with the improved techniques, but also, in 1959, Jerome Lejeune found the additional small chromosome underlying mongolism (mercifully renamed Down syndrome)8 and others described the numerical abnormalities of the sex chromosomes in the Turner and Klinefelter syndromes that same year. In the early 1960s, abnormalities in chromosome number and structure were described in other congenital malformation syndromes, such as trisomies 13 and 18, and in a variety of translocations, deficiencies, mosaics, and, in spontaneously aborted tissue, triploidy.
Figure 1. Progress in Mapping of Genes to Specific Chromosomes Through the Early Stages of the Human Genome Project. [Plot: No. of Genes Mapped (0 to 2500) by Year (1968 to 1990), shown separately for X-Chromosomal and Autosomal genes.]
The finding of a rather consistent chromosomal change in chronic myeloid leukemia9 in 1960 provided an early confirmation of Theodor Boveri’s chromosome theory of cancer.10 Named for the city of residence of its discoverers and patients following the practice of naming hemoglobin variants, the Philadelphia chromosome (Ph1) was thought to represent a partially deleted chromosome 21.
By improved chromosome staining techniques, it was shown in 1973 that chromosome 22, not 21, is involved and that the change that causes the abnormally short Ph1 chromosome is not a deletion but rather a reciprocal translocation between chromosomes 9 and 22.11
The ability to study the chromosomes in cultured cells in the amniotic fluid inaugurated the field of prenatal diagnosis of Down syndrome and other chromosomal aberrations by amniocentesis beginning about 1966. The characterization in cultured cells of enzyme deficiencies in inborn errors of metabolism had also progressed to the point that many of these could likewise be diagnosed prenatally by study of amniotic fluid cells.
About 1970, various staining methods were developed that showed banding of chromosomes. The distinctive pattern of this banding permitted unique identification of each chromosome. When combined with methods for studying the chromosomes at a stage of cell division when they are extended, banding methods made it possible to recognize small deletions and to interpret chromosomal rearrangements. Correlation of the specifically interpreted karyotype with phenotype led to the description of new microdeletion syndromes, such as Williams syndrome (online Mendelian Inheritance in Man [OMIM] 104050), and the related concept (and designation) of contiguous gene syndromes, eg, DiGeorge syndrome (OMIM 188400).12 Furthermore, it allowed the large area of hematologic malignancies that result from reciprocal translocations with creation of fusion genes to be studied. The Philadelphia chromosome was the first of these; the total number of examples is now more than 100.13
In the last 20 years, molecular cytogenetics, “chromosome painting,” and in situ hybridization for identification of deletions and rearrangements are only some of the methods used for characterizing the karyotype in clinical applications.
Gene Mapping
The first gene to be mapped to a specific chromosome in any species was probably the one for colorblindness. In 1911, cytologist E. B. Wilson14 concluded that the characteristic pedigree pattern of this trait, described by Pliny Earle in Philadelphia, Pa, in 184515 and by Friedrich Horner in Zurich, Switzerland, in 1876, was explained if the trait is recessive, the gene is on the X chromosome, and humans have a female-XX/male-XY sex chromosome constitution. In the following decades, a host of disorders were deduced to be X-linked from the characteristic pedigree pattern, so that by publication of the second edition of Mendelian Inheritance in Man in 1968,12 the catalog of X-linked phenotypes had 68 asterisked (seemingly confirmed) entries, each presumably related to a different gene. The precise localization on the X chromosome of these genes was not known; the first regional mapping was for the linked genes for colorblindness and G6PD deficiency,16 shown to be on the distal end of the long arm of the X in 1973.17
It was not until 1968, when 68 loci were already known to be on the X chromosome, that a gene was mapped to a specific autosome, ie, the Duffy blood group gene to chromosome 1.18 This was achieved by Roger Donahue, then a Johns Hopkins University PhD candidate in human genetics, through a linkage study of a chromosome 1 heteromorphism (one chromosome 1 was unusually long and appeared in the prebanding karyotypes to have an uncoiled region near the centromere) that he had found in his own family.
Progress in gene mapping is shown in FIGURE 1. The largest part of the progress in the 1970s19 was through study of interspecies somatic cell hybrids, particularly cells produced by fusing human and mouse cells.20 In cells derived by cell division from such hybrid cells, the full set of mouse chromosomes is retained, whereas individual human chromosomes are lost more or less at random.
The presence or absence of a particular human cell trait could be correlated with the presence or absence of a particular human chromosome in the derivative cells to determine that the gene for that trait was located on that chromosome. An early example (in 1971) was mapping of the gene that encodes the enzyme thymidine kinase to chromosome 17.21
Molecular genetics came to gene mapping about 1980 and contributed to the field in 3 ways: (1) It provided DNA probes for analysis of somatic cell hybrids so that one could “go directly for the gene” and not require expression of the human gene in the hybrid cell. (2) It provided DNA probes (at first radioactive, later fluorescent) for in situ hybridization to chromosomes. This method of direct mapping was first made to work reliably for single-copy genes in 1981.22 (3) Most importantly, molecular genetics provided an abundance of DNA markers that could be used for family linkage studies.23 Previously, such studies were seriously hampered by the pitifully small handful of available marker traits, ie, a few blood groups and serum or red blood cell proteins in which allelic variation could be demonstrated by immunologic, electrophoretic, or other methods. The abundant DNA markers first included restriction fragment length polymorphisms, followed by variable number tandem repeats, microsatellites or short tandem repeats, and, most recently, single-nucleotide polymorphisms.
By 1985, when the HGP, as an initiative to sequence completely the DNA of the human genome, was first formally proposed, about 700 genes had been mapped to specific chromosomes and, for many of these genes, to specific regions of chromosomes. The genes mapped included those for blood groups, enzymes, clotting factors, structural proteins, and so on.
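The somatic cell hybrid reasoning described above (correlating the presence or absence of a trait with the presence or absence of each human chromosome across hybrid clones) can be sketched as a simple discordance count. This is only an illustrative sketch: the function names and the clone panel are hypothetical, the data are made up, and real mapping panels must also cope with incomplete or noisy scoring.

```python
from typing import Dict, List


def discordance_by_chromosome(clones: List[dict], chromosomes: List[str]) -> Dict[str, int]:
    """For each human chromosome, count the clones in which the trait and
    the chromosome disagree (one is present while the other is absent)."""
    scores = {}
    for chrom in chromosomes:
        scores[chrom] = sum(
            1
            for clone in clones
            if (chrom in clone["retained"]) != clone["trait_present"]
        )
    return scores


def best_assignment(clones: List[dict], chromosomes: List[str]) -> str:
    """Return the chromosome with the fewest discordant clones."""
    scores = discordance_by_chromosome(clones, chromosomes)
    return min(scores, key=scores.get)


# Hypothetical panel of human-mouse hybrid clones (invented for illustration).
# "retained" lists the human chromosomes still present in the clone;
# "trait_present" records whether the human trait (eg, an enzyme) is detected.
HYPOTHETICAL_PANEL = [
    {"retained": {"1", "17", "X"}, "trait_present": True},
    {"retained": {"1", "4"}, "trait_present": False},
    {"retained": {"17"}, "trait_present": True},
    {"retained": {"4", "X"}, "trait_present": False},
    {"retained": {"4", "17", "X"}, "trait_present": True},
]

if __name__ == "__main__":
    print(discordance_by_chromosome(HYPOTHETICAL_PANEL, ["1", "4", "17", "X"]))
    print(best_assignment(HYPOTHETICAL_PANEL, ["1", "4", "17", "X"]))
```

In this invented panel only chromosome 17 is perfectly concordant with the trait, which mirrors the form of the argument used for thymidine kinase, not its actual data.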
Importantly, they also included the genes mutated in mystery diseases, so termed because at the time of mapping the nature of the basic defect was unknown. The usefulness of gene mapping to clinical medicine3 was particularly evident in connection with these disorders. It was the availability of an abundance of DNA markers for family linkage studies that advanced clinical application of gene mapping. The first mystery disease to be mapped through linkage to DNA markers was Huntington disease in 1983, located on the end of the short arm of chromosome 4.24 The clinical applicability of the information was immediately evident. By the linkage principle, one could now make diagnoses prenatally and premorbidly, provided that DNA of relatives was available for testing and DNA markers near the gene were found to be appropriately heterozygous in specific individuals. Considerable experience with the psychosocial problems surrounding predictive DNA testing then followed.
The other useful application of gene mapping was for disease gene discovery through positional cloning.25 The procedure used was to identify markers (ideally flanking markers) linked to the disease locus, to search the region for genes, and to scrutinize those genes for a mutation that cosegregates in the family with the Mendelian disorder. Sometimes the gene for an enzyme or other protein had already been mapped to the region and, thus, was a candidate gene; in other cases, a previously unknown gene was found strictly by its location in the chromosomal region identified by linkage mapping of the genetic disorder. It took 10 years for the gene mutant in Huntington disease to be isolated by positional cloning, which occurred in 1993.26 This was partly because of genetic peculiarities of that region of 4p (ie, many genes and a relatively high rate of recombination) and particularly because it was a new type of mutation, an expanded trinucleotide repeat.
The first 4 successes with positional cloning were chronic granulomatous disease27 and Duchenne muscular dystrophy28 in 1986 and retinoblastoma and cystic fibrosis29 in 1989. Cystic fibrosis was the first to be elucidated by positional cloning without the assistance of a cytogenetically visible deletion.
In the decade that followed, map-based gene discovery became a leading paradigm in biomedical research. All specialties of medicine used it to study some of their most puzzling disorders. Once the disease gene and its mutations were identified, specific DNA-based diagnostic tests could be designed. Furthermore, scientists were in a better position to determine pathogenetic mechanisms, the steps between gene and phene, ie, between genotype and phenotype. That information can often help investigators devise methods of intervention for treatment or secondary prevention.
The Ultimate Anatomy: The Sequence of the Human Genome
As noted earlier, a considerable number of genes had been mapped before the HGP was formally proposed. Furthermore, positional cloning for isolation of disease genes had been conceived, and proof of principle had been provided in 1986.
DNA was discovered in the 19th century. That the genetic material is DNA was first shown by Avery, MacLeod, and McCarty in 1944 in pneumococcus.30 They found that the so-called transforming factor, which converted one pneumococcus form to another, is DNA. In 1953, Watson and Crick7 deduced the double-helical structure of DNA from x-ray diffraction data. The genetic code of nucleotide triplets, each specifying a particular amino acid, was worked out in final detail in 1966. In the late 1960s, restriction enzymes, which cut DNA at specific sites and, thus, could be used as scalpels for the dissection of the genome, were discovered.
In the early 1970s, it was found that genes (including those of humans) could be cloned in abundance by splicing DNA into a bacterial plasmid31 and growing the bacteria, the so-called recombinant DNA technology. In 1977, improved methods of DNA sequencing were reported by Maxam and Gilbert32 and by Sanger et al.33 Remarkably, the dideoxy method of Sanger et al remains the technologic cornerstone of the HGP; the method has been modified extensively with respect to automation and efficiency but remains fundamentally the same. The circular bacterium-like chromosome of the cytoplasmic organelle, the mitochondrion, was completely sequenced, all 16 569 nucleotides, by Sanger’s group in 1981.34 Thus, some investigators were emboldened to propose sequencing the entire nuclear genome.
An early proposal for complete sequencing, ie, the HGP, came from the US Department of Energy, which had responsibilities in the area of the mutational effects of radiation. Importance in the solution of problems of cancer was cited by Renato Dulbecco35 as the main reason to undertake the HGP. Indeed, genomics has had perhaps its greatest impact on cancer. Usefulness to the understanding of birth defects of mapping all the genes had been proposed earlier.36 In the 1920s, Haldane pointed out the usefulness of linkage in diagnosis, and in his “Croonian Lecture” in 1948 he wrote that the “final aim . . . should be the enumeration and location of all the genes found in normal human beings.”37 The complete sequence was needed for finding all the genes.
The HGP was discussed, debated, and planned between 1985 and 1990 and had its official start in the United States on October 1, 1990.38 A National Research Council/National Academy of Sciences (NRC/NAS) committee39 on mapping and sequencing the human genome was commissioned in late 1986 and, reporting in February 1988, suggested that complete mapping and sequencing could be achieved in 10 to 15 years at a cost, in add-on funding, of about $200 million per year. In retrospect, this seems in some ways like a remarkably rash conclusion. Polymerase chain reaction was announced at a Cold Spring Harbor, NY, meeting in 1986,40 where the status of the gene map of Homo sapiens was reviewed41 and the HGP was discussed actively in a rump session. Yeast artificial chromosomes were invented in 1987, and bacterial artificial chromosomes and plasmid artificial chromosomes were introduced later as other mechanisms for cloning large DNA segments. The most polymorphic and, therefore, useful linkage markers, the microsatellites, were discovered in 1989 and the early 1990s. In the end, however, the estimates proved not far off.
The NRC/NAS committee recommended “map first, sequence later”39 because the sequencing technology was not yet at an efficient and economical level and because the maps, both genetic (eg, of microsatellite markers42) and physical (eg, of yeast artificial chromosome clones43), would be useful to the final sequencing. Thus, the HGP of the National Institutes of Health adopted a top-down approach when it was initiated October 1, 1990. James D. Watson, PhD, was the first director; he was succeeded by Francis S. Collins, MD, PhD, in 1993. The sequencing was performed clone by clone after the construction of genetic and physical maps.
Another recommendation of the NRC/NAS committee was that model organisms be studied in parallel with the human.39 Some of the most interesting and contributory parts of the HGP are the nonhuman genome projects, ie, those involving model organisms. Comparative genomics is a valuable way to gain understanding of the structure and function of the human genome and its genes. Expressed sequence tags (ESTs), that is, complementary DNA created from messenger RNA by reverse transcription, were developed in 199144 as a shortcut to the coding part of the human genome. Large EST databases for humans and many other species have been valuable to comparative genomics.
The first free-living organism in which the genome was completely sequenced was the bacterium Haemophilus influenzae, with 1 830 137 nucleotides and about 1800 protein-coding genes. This sequence was determined by the group of J. Craig Venter, PhD,45 using a bottom-up approach. The DNA of the circular bacterial chromosome was broken into segments by shearing, the segments were cloned and sequenced at random, and the individual sequences were then assembled through recognition of identity at overlapping ends. Thereafter, the genomes of a considerable number of other microorganisms were sequenced by the same approach, including Helicobacter pylori, Mycobacterium tuberculosis, and Treponema pallidum.
In 1996, baker’s yeast (Saccharomyces cerevisiae) was the first nucleated (eukaryotic) organism to be completely sequenced.46 The nematode Caenorhabditis elegans was the first multicellular organism to be completely sequenced, in 199847,48 and the complete sequence of the geneticists’ pet, Drosophila melanogaster, was reported in March 2000.49 The first 2 were sequenced clone by clone and the third by a combination of the clone-by-clone and random (“shotgun”) methods.
Following the success with random sequencing of clones, with subsequent assembly, in microorganisms, Venter and colleagues undertook the same in humans.
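The bottom-up idea described above, in which randomly sequenced fragments are assembled "through recognition of identity at overlapping ends," can be illustrated with a toy greedy overlap merger. This is a minimal sketch under strong assumptions (error-free reads, a single strand, no repeats, made-up sequences); it is not the assembler used for any genome mentioned in this article, and the function names and `min_overlap` parameter are hypothetical.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`,
    or 0 if no overlap of at least `min_overlap` bases exists."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: List[str], min_overlap: int = 3) -> List[str]:
    """Repeatedly merge the two fragments sharing the longest end overlap
    until no pair overlaps by at least `min_overlap` bases."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # remaining fragments do not overlap; leave them separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Made-up fragments of a made-up 15-base sequence, given out of order.
    fragments = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
    print(greedy_assemble(fragments))
```

Real genomes contain long repeats and sequencing errors, which is one reason the clone-by-clone maps discussed earlier were considered valuable; a greedy merger like this one would be misled by them.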
They established a private company (Celera Genomics Inc) to work on a factory scale: DNA and clone preparation, sequencing, and assembly, all assisted by automation and computerization. The approach was validated by the sequencing of Drosophila. The genomes of 5 specific humans representing 4 different ethnic backgrounds and both sexes were sequenced by Celera.50 The publicly funded HGP rose to the competitive challenge and accelerated its sequencing, with a coordination of efforts in several laboratories in the United States, United Kingdom, Japan, France, Germany, and elsewhere, under the leadership of Francis Collins.51
The data generated by the publicly funded HGP are available in public databases free of charge. The sequence data generated by Celera, collated with the publicly available data and with annotation as well as computer-based methods of analysis, are available to academic researchers by subscription and to pharmaceutical and other nonacademic laboratories at a substantially higher subscription rate.
At the White House on June 26, 2000, Collins and Venter announced completion of initial public and private drafts of the human sequence. These were published in mid-February 2001: the publicly funded results from laboratories in the United States, United Kingdom, and elsewhere in Nature51 and the results of Celera in Science.50 In each journal, accompanying articles described some of the implications of the new information.
When the complete human genome sequence was available, the total number of genes was only about half the number previously estimated52 and little more than twice the number in a much simpler organism such as C elegans.
The increased complexity of the human, as compared with the worm, for example, is achieved by increasing the number of different proteins encoded by single genes, through alternative splicing of messenger RNA, posttranscriptional and posttranslational modifications, formation of heteromeric proteins (ie, proteins combining the products of 2 or more different genes), and so on. The estimated 30 000 to 40 000 genes encode more than 10 times that number of proteins. This has resulted in a partial shift of focus from the gene to proteins and from genomics to proteomics.53-55
Another interesting but not new finding in the sequences published in February 200150,51 is the nonuniform density of genes within chromosomes (genes tend to be concentrated at the ends of chromosome arms) and between chromosomes. Chromosomes 19 and 22 are particularly gene-rich and chromosomes 13, 18, and 21 are relatively gene-poor. Related to the latter observation may be the fact that chromosomes 13, 18, and 21 are involved in the only autosomal trisomies that are compatible with live birth. Before completion of the HGP, the information on variation in gene density was already known on the basis of numbers of genes mapped and indirectly on the basis of GC vs AT content; high GC correlates with high gene count. Nonetheless, precise confirmation by the HGP was useful.
The Morbid Anatomy of the Human Genome: How Far Have We Come?
The anatomic metaphor is useful because it extends to the comparative anatomy and evolution of the human genome, as well as to its functional anatomy, developmental anatomy, and, particularly in the medical context, to its morbid anatomy.2 Progress in the last 40 years in defining the morbid anatomy of the human genome is chronicled in MIM, a catalog of human genes and genetic disorders (FIGURE 2).12 Computerized since 1964, MIM was a pioneer in computer-based publication (eg, the first edition in 1966) and has been available online as OMIM since 1987. The periodic print editions, most recently the 12th, published in 3 volumes in 1998, represent serial cross-sections of the field of genetic medicine in the last 35 years. OMIM has the advantage of daily updating, ease of searching, and ease of linking to related sources of information, such as that on DNA and protein structure and that on the related biomedical literature; MIM has the archival advantage and that of accessibility in a nonelectronic setting, as well as ease of browsing.
Figure 2. Growth of Mendelian Inheritance in Man: A Catalog of Human Genes and Genetic Disorders (MIM)12 and Its Online Version (OMIM). [Plot: No. of MIM Entries (0 to 14 000) by Year (1965 to 2005), marking the print editions MIM1 through MIM12 and OMIM, which stood at 13 083 entries on October 21, 2001.] Each entry is an essay on a particular phenotype (usually a disorder) or gene, with extensive bibliographic references and, in the case of OMIM, links to many other sources of information.
Throughout its history, MIM (and OMIM) has attempted comprehensive cataloging of gene mapping, especially any gene related to disease, and comprehensive cataloging of specific disease-related mutations. The number of genes with 1 or more disease-related mutations passed the 1000 mark about January 1, 2001.56 FIGURE 3A indicates the pace of disease-related gene identification during the last 20 years.53 Figure 3B indicates the pace at which specific genetic disorders have been characterized at the DNA level.53 The total number of characterized disorders (more than 1600) exceeds the number of disease-related genes (more than 1200) because many such genes are the site of mutations causing more than one distinct disorder, eg, the β-globin gene, which is the site of mutations causing sickle cell disease, thalassemia, Heinz body hemolytic anemia, methemoglobinemia, erythremia, and so on. In part, the excess of distinct, molecularly defined disorders over the number of genes involved is a corollary of the “one gene, many proteins” phenomenon supported by the unexpectedly low total gene counts from analysis of the human genome sequence.
The counts in Figure 3 include both germline (heritable) and somatic mutations. They do not include about 100 disease-related genes first identified as translocation-fusion partners in leukemias and some solid tumors. The counts do include some genes in which specific susceptibility or resistance alleles have been identified through association studies.
Where Do We Go From Here?
Clearly, there is much we don’t know, as reflected in the quotes:
“As the radius of knowledge gets longer, the circumference of the unknown expands even more.” (Anonymous)
“How is it that we know so little, given that we have so much information?” (Noam Chomsky)
We may soon have a complete catalog of the genes, but we do not know all of the protein products of all of those genes or the function of all of those proteins, or even a majority of them, in isolation, let alone in concert with others. We do not know the worldwide variation in the genes. We do not know the correlation between structural variation in the genes and variation in function as reflected in the phenotype. These matters will occupy biology and medicine for a long time to come.
Progress in the HGP has brought several paradigm shifts that are relevant to the neo-Vesalian basis of medicine provided by this ultimate anatomy of the genome. A shift from genomics57 to proteomics,55 or at least an extension to proteomics, comes from the realization that several or even many different proteins can be encoded by a single gene, as must be the case to account for the increased complexity of humans as compared with C elegans and Drosophila, which have one third or one half as many genes as humans. A shift from map-based gene discovery to sequence-based gene discovery has occurred because availability of databases with expressed sequence tags or complete sequence information on many different organisms has made research in silico (cybergenomics) possible. A shift of emphasis from relatively rare single-gene disorders to common disorders of multifactorial causation (complex traits) has occurred, now that it is possible that susceptibility alleles that collaborate in causation can be identified. Such alleles may be found through studies of association between the particular disorder and the multitude of DNA markers, particularly single nucleotide polymorphisms, now available. As a consequence, DNA testing will no longer be limited to specific diagnosis of Mendelian disorders but can be extended to recognition of vulnerability or resistance to common disorders.
The Neo-Vesalian Influence on 21st-Century Medicine
As indicated in more detail elsewhere,58 the availability of the human genome sequence and information on proteomics related to the sequence is likely to change medicine in many ways. It will influence reproductive medicine, for example, permitting ever more specific diagnoses at earlier stages. It will surely advance gene therapy.
In a more general way, genomics is likely to render medicine more predictive and, therefore, more preventive. Comprehensive “genome screens” for recognition of individual susceptibilities to common disorders can be foreseen. Genomics-based clinical medicine will require that primary care physicians be competent in the interpretation of gene screens and in advising appropriate health measures.
At the same time, on the traditional turf of clinical medicine, diagnosis will become more specific and precise, and treatment also more specific and safer. Genomics-based individualization of medical care aims to achieve the right treatment for the right patient.59 More precise characterization of the genome in common disorders, such as hypertension and mental illness, will identify diagnostic subtypes for which different therapies will be more effective. Cancer medicine is at the forefront in the use of genomics to match specific diagnosis with specific treatment. Morphologically indistinguishable neoplasms have been shown on “biopsy” of their altered genomes to be different, suggesting that different treatments are indicated.
Better understanding of individual genomic constitutions should permit drug therapy to get away from the one-size-fits-all approach. It should allow selection of drugs likely to be more effective in the treatment of a given disorder in a given individual. Even though a particular drug may be effective in the treatment of a disorder in a particular patient, the genomic constitution of the patient may place him or her at an increased risk of adverse drug reaction. Identifying such risks is part of picking the right treatment for the right patient and can reduce the very considerable toll of iatrogenic morbidity and mortality.
As reflected in the large investments in genomics by pharmaceutical firms, genomics is anticipated to lead to identification of new drug targets, ie, genes and gene products involved in physiologic processes that can be enhanced or downregulated by custom designed drugs, pharmacogenomics.59 Genomics-based drug development should lead to entirely new medications for disorders not now treatable and to more effective and safer replacements for current drugs.
In summary, chromosome analysis, gene mapping, and complete sequencing of the genome provide an anatomic basis for all aspects of clinical medicine.
Figure 3. Pace of Disease Gene Discovery and Molecular Characterization of Clinical Disorders, 1981-2000
[Figure not reproduced. Panel A: No. of Genes Discovered With Disease-Related Mutations, by year. Panel B: No. of Clinical Disorders Characterized at the Molecular Level, by year.]
Reprinted with permission from Peltonen L, McKusick VA. Dissecting human disease in the post-genomic era. Science. 2001;291:1224-1229.53 A, The number of disease genes discovered by the end of 2000 was 1112, including both germline and neoplasia-related somatic mutations. This number does not include all the genes identified as translocation gene-fusion partners in neoplastic disorders. Numbers in parentheses indicate genes with disease-related polymorphic alleles (susceptibility genes). B, The number of clinical disorders characterized by the end of 2000 was 1430. This number does not include the many neoplastic disorders caused by translocation-related fusion genes.
As with any powerful new development, there are hazards of misuse of information or techniques. Privacy and confidentiality are issues of importance. A serious risk may be public misunderstanding that will prevent maximum benefit from being realized from the information. Blind fear of anything genetic is one problem. Another is the misconception of determinism, the idea that the phenotype is hard-wired to the genotype. Most genetic tests will provide an answer in probabilistic terms. It is useful for persons to understand the principles of probability in relation to their own health. The fact that the chance of a particular outcome is not 100% but some lower number means that there may be lifestyle, dietary, or medical measures that can be used to reduce the probability even further. Clearly, education of both the public and health care professionals is vital if the full benefits of neo-Vesalian medicine are to be realized.
REFERENCES
1. McKusick VA. The anatomy of the human genome. Am J Med. 1980;69:267-276.
2. McKusick VA. The human genome through the eyes of Mercator and Vesalius. Trans Am Clin Climatol Assoc. 1981;42:66-90.
3. McKusick VA. The morbid anatomy of the human genome: a review of gene mapping in clinical medicine. Medicine. 1986;65:1-33, 1987;66:1-63, 1987;66:237-296, 1988;67:1-19.
4. McKusick VA. History of medical genetics. In: Rimoin DL, Connor JM, Pyeritz RE, eds. Emery-Rimoin Principles and Practice of Medical Genetics. 4th ed. Edinburgh, Scotland: Churchill Livingstone; 2001:3-36.
5. Tjio JH, Levan A. The chromosome number in man. Hereditas. 1956;42:1-6.
6. Ford CE, Hamerton JC. The chromosomes of man. Nature. 1956;178:1020-1023.
7. Watson JD, Crick FH. Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature. 1953;171:737-738.
8. Allen G, Benda CE, Böök JA, et al. “Mongolism.” Lancet. 1961;1:775.
9. Nowell PC, Hungerford DA. A minute chromosome in human chronic granulocytic leukemia. Science. 1960;132:1497.
10. McKusick VA. Marcella O’Grady Boveri (1865-1950) and the chromosome theory of cancer. J Med Genet. 1985;22:431-440.
11. Rowley JD. A new consistent abnormality in chronic myelogenous leukemia identified by quinacrine fluorescence and Giemsa staining. Nature. 1973;243:290-293.
12. McKusick VA. Mendelian Inheritance in Man. Baltimore, Md: Johns Hopkins University Press; 1966. [Online Mendelian Inheritance in Man. 12th ed. 1998. Available at: http://www.ncbi.nlm.nih.gov/omim.]
13. Mitelman F. Catalog of Chromosome Aberrations in Cancer. 5th ed. New York, NY: Wiley-Liss; 1994.
14. Wilson EB. The sex chromosomes. Arch Mikrosk Anat Entwicklungsmech. 1911;77:249-271.
15. Earle P. On the inability to distinguish colors. Am J Med Sci. 1845;9:346-354.
16. Porter IH, Schulze J, McKusick VA. Genetical linkage between the loci for glucose-6-phosphate dehydrogenase deficiency and colour-blindness in American Negroes. Ann Hum Genet. 1962;16:107-122.
17. Ricciuti FC, Ruddle FH. Assignment of three gene loci (PGK, HGPRT, and G6PD) to the long arm of the human X-chromosome by somatic cell genetics. Genetics. 1973;74:661-678.
18. Donahue RP, Bias WB, Renwick JH, McKusick VA. Probable assignment of the Duffy blood group locus to chromosome 1 in man. Proc Natl Acad Sci U S A. 1968;61:949-955.
19. McKusick VA, Ruddle FH. The status of the gene map of the human chromosomes. Science. 1977;196:390-405.
20. Weiss M, Green H. Human-mouse hybrid cell lines containing partial complements of human chromosomes and functioning human genes. Proc Natl Acad Sci U S A. 1967;58:1104-1111.
21. Miller OJ, Allderdice PW, Miller DA. Human thymidine kinase gene locus: assignment to chromosome 17 in a hybrid of man and mouse cells. Science. 1971;173:244-245.
22. Harper ME, Ullrich A, Saunders GF. Localization of the human insulin gene to the distal end of the short arm of chromosome 11. Proc Natl Acad Sci U S A. 1981;78:4458-4460.
23. Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980;32:314-331.
24. Gusella JF, Wexler NS, Conneally PM, et al. A polymorphic DNA marker genetically linked to Huntington’s disease. Nature. 1983;306:234-238.
25. Collins FS. Positional cloning: let’s not call it reverse anymore. Nat Genet. 1992;1:3-6.
26. Huntington’s Disease Collaborative Research Group. A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell. 1993;72:971-983.
27. Royer-Pokora B, Kunkel LM, Monaco AP, et al. Cloning the gene for an inherited human disorder--chronic granulomatous disease--on the basis of its chromosomal location. Nature. 1986;322:32-38.
28. Monaco AP, Neve RL, Colletti-Feener C, Bertelson CJ, Kurnit DM, Kunkel LM. Isolation of candidate cDNAs for portions of the Duchenne muscular dystrophy gene. Nature. 1986;323:646-650.
29. Rommens JM, Iannuzzi MC, Kerem B, et al. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science. 1989;245:1059-1065.
30. Avery OT, MacLeod CM, McCarty M. Studies on the chemical nature of the substance inducing transformation of pneumococcal types: induction of transformation by a deoxyribonucleic acid fraction from Pneumococcus type III. J Exp Med. 1944;79:137-158.
31. Cohen S, Chang A, Boyer H, Helling R. Construction of biologically functional bacterial plasmids in vitro. Proc Natl Acad Sci U S A. 1973;70:3240-3244.
32. Maxam AM, Gilbert W. A new method for sequencing DNA. Proc Natl Acad Sci U S A. 1977;74:1258-1262.
33. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977;74:5463-5467.
34. Anderson S, Bankier AT, Barrell BG, et al. Sequence and organization of the human mitochondrial genome. Nature. 1981;290:457-465.
35. Dulbecco R. A turning point in cancer research: sequencing the human genome. Science. 1986;231:1055-1056.
36. McKusick VA. Prospects for progress. In: Fraser FC, McKusick VA, eds. Congenital Malformations. Vol 3. Amsterdam, the Netherlands: Excerpta Medica; 1970:407.
37. Haldane JB. The formal genetics of man. Proc R Soc B. 1948;135:147-170.
38. McKusick VA. Mapping and sequencing the human genome. N Engl J Med. 1989;320:910-915.
39. Alberts BM, Botstein D, Brenner S, et al. Report of the Committee on Mapping and Sequencing the Human Genome. Washington, DC: National Academy Press; 1988.
40. Mullis K, Faloona F, Scharf S. Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Cold Spring Harb Symp Quant Biol. 1986;51:265-273.
41. McKusick VA. The gene map of Homo sapiens: status and prospectus. Cold Spring Harb Symp Quant Biol. 1986;51:15-27, 1123-1208.
42. Dib C, Faure S, Fizames C, et al. A comprehensive genetic map of the human genome based on 5264 microsatellites. Nature. 1996;380:152-154.
43. Chumakov IM, Rigault P, Le Gall I, et al. YAC contig map of the human genome. Nature. 1995;377(suppl):175-297.
44. Adams MD, Kelley JM, Gocayne JD, et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991;252:1651-1656.
45. Fleischmann RD, Adams MD, White O, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269:496-512.
46. Goffeau A, Barrell BG, Bussey H, et al. Life with 6000 genes. Science. 1996;274:546-567.
47. C elegans Sequencing Consortium. Genome sequence of the nematode C elegans: a platform for investigating biology. Science. 1998;282:2012-2018.
48. Chervitz SA, Aravind L, Sherlock G, et al. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science. 1998;282:2022-2028.
49. Adams MD, Celniker SE, Holt RA, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185-2195.
50. Venter JC, Adams MD, Myers EW, et al. The sequence of the human genome. Science. 2001;291:1304-1351.
51. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860-921.
52. Fields C, Adams MD, White O, Venter JC. How many genes in the human genome? Nat Genet. 1994;7:345-346.
53. Peltonen L, McKusick VA. Dissecting human disease in the post-genomic era. Science. 2001;291:1224-1229.
54. Wilkins MR, Pasquali C, Appel RD, et al. From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Biotechnology. 1996;14:61-65.
55. Pandey A, Mann M. Proteomics to study genes and genomes. Nature. 2000;405:837-846.
56. Antonarakis SE, McKusick VA. OMIM passes the 1000 disease-gene mark. Nat Genet. 2000;25:11.
57. McKusick VA, Ruddle FH. A new discipline, a new name, a new journal. Genomics. 1987;1:1-2.
58. Collins FS, McKusick VA. Implications of the human genome project for medical science. JAMA. 2001;285:540-544.
59. Roses AD. Pharmacogenomics and the practice of medicine. Nature. 2000;405:857-865.
JAMA, November 14, 2001, Vol 286, No. 18 (Reprinted). ©2001 American Medical Association. All rights reserved.