SPECIAL COMMUNICATION
The Anatomy of the Human Genome
A Neo-Vesalian Basis for Medicine in the 21st Century
Victor A. McKusick, MD
The linear arrangement of
genes on our chromosomes is
part of our microanatomy.
When we speak of mapping
genes on chromosomes, we use a cartographic metaphor. An equally appropriate anatomic metaphor is the anatomy
of the human genome.1-3 Clinical cytogenetics (starting in the late 1950s), mapping genes on chromosomes (beginning for autosomes in the late 1960s),
and comprehensive DNA sequencing of
the genome (initiated in the late 1980s)
have provided, in the words of Charles
Scriver, MDCM (oral communication,
1982), a neo-Vesalian basis for medicine. The influence on medicine is fully
as great as was that of Andreas Vesalius’
De humani corporis fabrica, which was
published in 1543 and was the basis of
Harvey’s physiology of the circulation
(1628) and Morgagni’s morbid anatomy
(1761).
The history of medical genetics4 can
be discussed in 2 parts, the pre-1956
foundations of medical genetics going
back to Mendel and the greats of the
first half of the 20th century, and the
developments in the period since 1956,
during which medical genetics has
evolved into a full-fledged clinical and
academic field. The objective of this article is to trace the influence of chromosome studies, gene mapping, and
DNA sequencing (the Human Genome Project [HGP]) on the evolution of medical genetics since 1956.

Since 1956, the anatomy of the human genome has been described on the
basis of chromosome studies, gene mapping, and DNA sequencing. The gross
anatomy of Andreas Vesalius, published in 1543, played a leading role in
the development of modern medicine. The objective of this article is to show
that knowledge of genomic anatomy is having a comparably strong and pervasive influence on all of medicine. The research revealing human genome
anatomy is reviewed. The insight provided by genome anatomy has brought
about shifts of focus, both in research and in the clinic, eg, from genomics
to proteomics and from the individually rare, single-gene disorders to common disorders. Genomic anatomy permits medicine to become more predictive and preventive. At the same time, diagnosis and treatment are rendered more sensitive, specific, effective, and safe. Hazards in misuse and
misunderstanding of the information exist. Education of both the public and
health professionals is vital if the full benefits of neo-Vesalian medicine are
to be realized.
JAMA. 2001;286:2289-2295
www.jama.com

Clinical Cytogenetics
(Chromosomology)
1956 was a watershed year in the history of medical genetics. In that year, the
correct diploid chromosome number of
46 (not 48, as previously thought) was
established.5,6 It is remarkable that it was
not until 3 years after the determination of the double-helical structure of
DNA by Watson and Crick7 that the correct number of chromosomes in humans was determined. The advance was
significant to medicine, not because of
the specific numerology but because of
the associated simple improvements in
technique that made chromosome analysis feasible in the study of disease and
in clinical diagnosis. Medical genetics,
which really did not exist as a clinical
specialty before 1956, was given its own
organ, the nucleus, just as cardiology had
the heart, neurology the nervous system, etc.
©2001 American Medical Association. All rights reserved.
Not only was the correct chromosome number established with the
improved techniques, but also, in
1959, Jerome Lejeune found the additional small chromosome underlying
mongolism (mercifully renamed
Down syndrome)8 and others
described the numerical abnormalities
of the sex chromosomes in the Turner
and Klinefelter syndromes that same
year. In the early 1960s, abnormalities
in chromosome number and structure
were described in other congenital
malformation syndromes, such as trisomies 13 and 18, and in a variety of
translocations, deficiencies, mosaics,
and, in spontaneously aborted tissue,
triploidy. The finding of a rather
consistent chromosomal change in
chronic myeloid leukemia9 in 1960
provided an early confirmation of
Theodor Boveri’s chromosome theory
of cancer.10 Named for the city of residence of its discoverers and patients
following the practice of naming
hemoglobin variants, the Philadelphia
chromosome (Ph1) was thought to
represent a partially deleted chromosome 21. By improved chromosome
staining techniques, it was shown in
1973 that chromosome 22, not 21, is
involved and that the change that
causes the abnormally short Ph1 chromosome is not a deletion but rather a
reciprocal translocation between chromosomes 9 and 22.11
Author Affiliation: McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of
Medicine, Baltimore, Md.
Disclosure: Dr McKusick was a member of the Program Advisory Committee for the National Institutes
of Health Human Genome Project 1989-1992 and is
a member of the Scientific Advisory Board of Celera
Genomics Inc.
Corresponding Author and Reprints: Victor A. McKusick, MD, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins Hospital, 600 N Wolfe St, Blalock
Bldg, Room 1007, Baltimore, MD 21287-4922 (e-mail:
[email protected]).
(Reprinted) JAMA, November 14, 2001—Vol 286, No. 18
Downloaded from www.jama.com at Washington University - St Louis, on October 27, 2005
Figure 1. Progress in Mapping of Genes to Specific Chromosomes Through the Early Stages of the Human Genome Project
[Plot: number of genes mapped (0 to 2500) by year (1968 to 1990), with separate X-chromosomal and autosomal series.]
The ability to study the chromosomes in cultured cells in the amniotic fluid inaugurated the field of prenatal diagnosis of Down syndrome and
other chromosomal aberrations by amniocentesis beginning about 1966. The
characterization in cultured cells of enzyme deficiencies in inborn errors of
metabolism had also progressed to the
point that many of these could likewise be diagnosed prenatally by study
of amniotic fluid cells.
About 1970, various staining methods were developed that showed banding of chromosomes. The distinctive pattern of this banding permitted unique
identification of each chromosome.
When combined with methods for
studying the chromosomes at a stage of
cell division when they are extended,
banding methods made it possible to recognize small deletions and to interpret
chromosomal rearrangements. Correlation of the specifically interpreted
karyotype with phenotype led to the description of new microdeletion syndromes, such as Williams syndrome (Online Mendelian Inheritance in Man
[OMIM] 194050), and the related concept (and designation) of contiguous
gene syndromes, eg, DiGeorge syndrome (OMIM 188400).12 Furthermore, it allowed the large area of hematologic malignancies that result from
reciprocal translocations with creation
of fusion genes to be studied. The Philadelphia chromosome was the first of
these; the total number of examples is
now more than 100.13
In the last 20 years, molecular cytogenetics, “chromosome painting,” and
in situ hybridization for identification
of deletions and rearrangements are
only some of the methods used for characterizing the karyotype in clinical applications.
Gene Mapping
The first gene to be mapped to a specific chromosome in any species was
probably the one for colorblindness. In
1911, cytologist E. B. Wilson14 concluded that the characteristic pedigree
pattern of this trait, described by Pliny
Earle in Philadelphia, Pa, in 184515 and
by Friedrich Horner in Zurich, Switzerland, in 1876, was explained if the
trait is recessive, the gene is on the X
chromosome, and humans have a female-XX/male-XY sex chromosome
constitution. In the following decades, a host of disorders were deduced to be X-linked from the characteristic pedigree pattern, so that by
publication of the second edition of
Mendelian Inheritance in Man in 1968,12
the catalog of X-linked phenotypes had
68 asterisked (seemingly confirmed)
entries, each presumably related to a different gene. The precise localization on
the X chromosome of these genes was
not known; the first regional mapping
was for the linked genes for colorblindness and G6PD deficiency,16 shown to
be on the distal end of the long arm of
the X in 1973.17
It was not until 1968, when 68 loci
were already known to be on the X
chromosome, that a gene was mapped
to a specific autosome, ie, the Duffy
blood group gene to chromosome 1.18
This was achieved by Roger Donahue,
then a Johns Hopkins University PhD
candidate in human genetics, through
a linkage study of a chromosome 1 heteromorphism (one chromosome 1 was
unusually long and appeared in the prebanding karyotypes to have an uncoiled region near the centromere) that
he had found in his own family.
Progress in gene mapping is shown
in FIGURE 1. The largest part of the
progress in the 1970s19 was through
study of interspecies somatic cell hybrids, particularly cells produced by fusing human and mouse cells.20 In cells
derived by cell division from such hybrid cells, the full set of mouse chromosomes is retained, whereas individual human chromosomes are lost
more or less at random. The presence
or absence of a particular human cell
trait could be correlated with the presence or absence of a particular human
chromosome in the derivative cells to
determine that the gene for that trait
was located on that chromosome. An
early example (in 1971) was mapping
of the gene that encodes the enzyme
thymidine kinase to chromosome 17.21
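The concordance reasoning behind somatic cell hybrid mapping can be illustrated with a minimal sketch. The clone panel below is invented for illustration, and the names `clones`, `concordance`, and `best_candidates` are hypothetical. The sketch assumes that both the trait and the retained chromosomes are scored without error, which real hybrid panels did not guarantee.

```python
# Hypothetical hybrid clone panel (invented data): for each clone, the set of
# human chromosomes it retained and whether the human trait was detected.
clones = [
    ({"1", "17", "X"}, True),
    ({"2", "17"}, True),
    ({"1", "5", "X"}, False),
    ({"17", "21"}, True),
    ({"5", "21", "X"}, False),
]


def concordance(clones, chromosome):
    """Fraction of clones in which trait presence matches chromosome presence."""
    matches = sum((chromosome in retained) == trait for retained, trait in clones)
    return matches / len(clones)


def best_candidates(clones):
    """Chromosomes whose presence or absence matches the trait in every clone."""
    all_chromosomes = set().union(*(retained for retained, _ in clones))
    return sorted(c for c in all_chromosomes if concordance(clones, c) == 1.0)


print(best_candidates(clones))
```

In this invented panel only one chromosome is fully concordant with the trait; with fewer clones, several chromosomes can remain tied, which is why panels with many independent clones were needed.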
Molecular genetics came to gene
mapping about 1980 and contributed
to the field in 3 ways: (1) It provided
DNA probes for analysis of somatic cell
hybrids so that one could “go directly
for the gene” and not require expression of the human gene in the hybrid
cell. (2) It provided DNA probes (at first
radioactive, later fluorescent) for in situ
hybridization to chromosomes. This method for direct mapping was first
made to work reliably for single-copy
genes in 1981.22 (3) Most importantly, molecular genetics provided an
abundance of DNA markers that could
be used for family linkage studies.23 Previously, such studies were seriously
hampered by the pitifully small handful of available marker traits, ie, a few
blood groups and serum or red blood
cell proteins in which allelic variation
could be demonstrated by immunologic, electrophoretic, or other methods. The abundant DNA markers first
included restriction fragment length
polymorphisms, followed by variable
number tandem repeats, microsatellites or short tandem repeats, and, most
recently, single-nucleotide polymorphisms.
By 1985 when the HGP, as an initiative to sequence completely the DNA of
the human genome, was first formally
proposed, about 700 genes had been
mapped to specific chromosomes and,
for many of these genes, to specific regions of chromosomes. The genes
mapped included those for blood groups,
enzymes, clotting factors, structural proteins, and so on. Importantly, they also
included the genes mutated in mystery
diseases, so termed because at the time
of mapping the nature of the basic defect was unknown. The usefulness of
gene mapping to clinical medicine3 was
particularly evident in connection with
these disorders. It was the availability of
an abundance of DNA markers for family linkage studies that advanced clinical application of gene mapping.
The first mystery disease to be mapped
through linkage to DNA markers was
Huntington disease in 1983, located on
the end of the short arm of chromosome 4.24 The clinical applicability of the
information was immediately evident. By
the linkage principle, one could now
make diagnoses prenatally and premorbidly, provided that DNA of relatives
was available for testing and DNA markers near the gene were found to be
appropriately heterozygous in specific
individuals. Considerable experience
with the psychosocial problems surrounding predictive DNA testing then
followed.
The other useful application of gene
mapping was for disease gene discovery through positional cloning.25 The
procedure used was to identify markers (ideally flanking markers) linked to
the disease locus, to search the region
for genes, and to scrutinize those genes
for a mutation that cosegregates in the
family with the Mendelian disorder.
Sometimes the gene for an enzyme or
other protein had already been mapped
to the region and, thus, was a candidate gene; in other cases, a previously
unknown gene was found strictly by its
location in the chromosomal region
identified by linkage mapping of the genetic disorder.
It took 10 years for the gene mutated
in Huntington disease to be isolated by
positional cloning, which occurred in
1993.26 This was partly because of genetic peculiarities of that region of 4p
(ie, many genes and a relatively high rate
of recombination) and particularly because it was a new type of mutation, an
expanded trinucleotide repeat.
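What "expanded trinucleotide repeat" means at the sequence level can be shown with a short sketch that counts the longest uninterrupted run of a 3-base unit. The function name `longest_repeat_run` and both test sequences are hypothetical. The default unit CAG is an assumption added here (the article does not name the unit); it is the unit expanded in Huntington disease. No clinical thresholds are implied.

```python
import re


def longest_repeat_run(sequence, unit="CAG"):
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Invented sequences: the same flanking bases with a short and a longer repeat tract.
normal_like = "ATG" + "CAG" * 4 + "CCGCCA"
expanded_like = "ATG" + "CAG" * 12 + "CCGCCA"

print(longest_repeat_run(normal_like), longest_repeat_run(expanded_like))
```

Because the mutation changes the length of a tract rather than substituting or deleting a base, a search designed around point mutations or visible deletions would not readily detect it.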
The first 4 successes with positional
cloning were chronic granulomatous
disease27 and Duchenne muscular dystrophy28 in 1986 and retinoblastoma
and cystic fibrosis29 in 1989. Cystic fibrosis was the first to be elucidated by
positional cloning without the assistance of a cytogenetically visible deletion. In the decade that followed, map-based gene discovery became a leading
paradigm in biomedical research. All
specialties of medicine used it to study
some of their most puzzling disorders. Once the disease gene and its
mutations were identified, specific
DNA-based diagnostic tests could be designed. Furthermore, scientists were in
a better position to determine pathogenetic mechanisms, the steps between gene and phene, ie, between
genotype and phenotype. That information can often help investigators devise methods of intervention for treatment or secondary prevention.
The Ultimate Anatomy:
The Sequence of
the Human Genome
As noted earlier, a considerable number of genes had been mapped before
the HGP was formally proposed. Furthermore, positional cloning for isolation of disease genes had been conceived, and proof of principle had been
provided in 1986.
DNA was discovered in the 19th century. That the genetic material is DNA
was first shown by Avery, MacLeod, and
McCarty in 1944 in pneumococcus.30
They found that the so-called transforming factor, which converted one pneumococcus form to another, is DNA.
In 1953, Watson and Crick7 deduced the double-helical structure of
DNA from x-ray diffraction data. The
genetic code of nucleotide triplets, each
specifying a particular amino acid, was
worked out in final detail in 1966. In
the late 1960s, restriction enzymes,
which cut DNA at specific sites and,
thus, could be used as scalpels for the
dissection of the genome, were discovered. In the early 1970s, it was found
that genes (including those of humans) could be cloned in abundance
by splicing DNA into a bacterial plasmid31 and growing the bacteria, the so-called recombinant DNA technology.
In 1977, improved methods of DNA
sequencing were reported by Maxam
and Gilbert32 and by Sanger et al.33 Remarkably, the dideoxy method of
Sanger et al remains the technologic
cornerstone of the HGP; the method has
been modified extensively with respect to automation and efficiency but
remains fundamentally the same.
The circular bacterium-like chromosome of the cytoplasmic organelle, the
mitochondrion, was completely sequenced, all 16 569 nucleotides, by
Sanger’s group in 1981.34 Thus, some investigators were emboldened to propose sequencing the entire nuclear genome. An early proposal for complete
sequencing, ie, the HGP, came from the
US Department of Energy, which had responsibilities in the area of the mutational effects of radiation. Importance in
the solution of problems of cancer was
cited by Renato Dulbecco35 as the main
reason to undertake the HGP. Indeed,
genomics has had perhaps its greatest
impact on cancer. Usefulness to the understanding of birth defects of mapping all the genes had been proposed earlier.36 In the 1920s, Haldane pointed out
the usefulness of linkage in diagnosis,
and in his “Croonian Lecture” in 1948
he wrote that the “final aim . . . should
be the enumeration and location of all
the genes found in normal human
beings.”37 The complete sequence was
needed for finding all the genes.
The HGP was discussed, debated, and
planned between 1985 and 1990 and
had its official start in the United States
on October 1, 1990.38 A National Research Council/National Academy of
Sciences (NRC/NAS) committee39 on
mapping and sequencing the human genome was commissioned in late 1986
and, reporting in February 1988, suggested that complete mapping and sequencing could be achieved in 10 to 15
years at a cost, in add-on funding, of
about $200 million per year. In retrospect, this seems in some ways like a
remarkably rash conclusion. Polymerase chain reaction was announced at a
Cold Spring Harbor, NY, meeting in
1986,40 where the status of the gene map
of Homo sapiens was reviewed41 and the
HGP was discussed actively in a rump
session. Yeast artificial chromosomes
were invented in 1987, and bacterial artificial chromosomes and P1-derived artificial chromosomes were introduced
later as other mechanisms for cloning large
DNA segments. The most polymorphic and, therefore, useful linkage markers, the microsatellites, were
discovered in 1989 and the early 1990s.
In the end, however, the estimates
proved not far off.
The NRC/NAS committee recommended “map first, sequence later”39 because the sequencing technology was not
yet at an efficient and economical level
and because the maps, both genetic (eg,
of microsatellite markers42) and physical (eg, of yeast artificial chromosome
clones43), would be useful to the final sequencing. Thus, the HGP of the National Institutes of Health adopted a
top-down approach when it was initiated October 1, 1990. James D. Watson, PhD, was the first director; he was
succeeded by Francis S. Collins, MD,
PhD, in 1993. The sequencing was
performed clone by clone after the
construction of genetic and physical
maps. Another recommendation of the
NRC/NAS committee was that model organisms be studied in parallel with the
human.39 Some of the most interesting
and contributory parts of the HGP are
the nonhuman genome projects, ie,
those involving model organisms. Comparative genomics is a valuable way to
gain understanding of the structure and
function of the human genome and its
genes.
Expressed sequence tags (ESTs), that
is, complementary DNA created from
messenger RNA by reverse transcription, were developed in 199144 as a shortcut to the coding part of the human genome. Large EST databases for humans
and many other species have been valuable to comparative genomics.
The first free-living organism in which
the genome was completely sequenced
was the bacterium Haemophilus influenzae, with 1 830 137 nucleotides and about
1800 protein-coding genes. This sequence was determined by the group of
J. Craig Venter, PhD,45 using a bottom-up approach. The DNA of the circular bacterial chromosome was broken into segments by shearing, the
segments were cloned and sequenced at
random, and the individual sequences
were then assembled through recognition of identity at overlapping ends.
Thereafter, the genomes of a considerable number of other microorganisms
were sequenced by the same approach,
including Helicobacter pylori, Mycobacterium tuberculosis, and Treponema pallidum. In 1996, baker’s yeast (Saccharomyces cerevisiae) was the first nucleated
(eukaryotic) organism to be completely sequenced.46 The nematode Caenorhabditis elegans was the first multicellular organism to be completely
sequenced, in 1998,47,48 and the complete sequence of the geneticists’ pet, Drosophila melanogaster, was reported in
March 2000.49 The first 2 were sequenced clone by clone and the third by
a combination of the clone-by-clone and
random (“shotgun”) methods.
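The bottom-up idea described above, assembling randomly sequenced fragments by recognizing identity at overlapping ends, can be sketched as a toy greedy merge. This is an illustration under strong assumptions (error-free reads, no repeated regions, a single strand) and is not the assembly software used by any of the groups named in this article. The names `overlap`, `greedy_assemble`, and `reads` are hypothetical, and the reads are invented.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest end overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no overlaps remain; leave the contigs separate
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


# Invented overlapping fragments of one short sequence.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))
```

Repeated sequence is the main reason this simple strategy breaks down on large genomes: identical overlaps can arise from different locations, which is one motivation for combining random sequencing with map information.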
Following the success with random
sequencing of clones, with subsequent assembly, in microorganisms,
Venter and colleagues undertook the
same in humans. They established a private company (Celera Genomics Inc)
to work on a factory scale: DNA and
clone preparation, sequencing, and assembly, all assisted by automation and
computerization. The approach was
validated by the sequencing of Drosophila. The genomes of 5 specific humans representing 4 different ethnic
backgrounds and both sexes were sequenced by Celera.50 The publicly
funded HGP rose to the competitive
challenge and accelerated its sequencing, with a coordination of efforts in several laboratories in the United States,
United Kingdom, Japan, France, Germany, and elsewhere, under the leadership of Francis Collins.51
The data generated by the publicly
funded HGP are available in public databases free of charge. The sequence
data generated by Celera, collated with
the publicly available data and with annotation as well as computer-based
methods of analysis, are available to academic researchers by subscription and
to pharmaceutical and other nonacademic laboratories at a substantially
higher subscription rate.
At the White House on June 26, 2000,
Collins and Venter announced completion of initial public and private drafts
of the human sequence. These were
published in mid-February 2001—
the publicly funded results from laboratories in the United States, United
Kingdom, and elsewhere in Nature51
and the results of Celera in Science.50
In each journal, accompanying articles described some of the implications of the new information.
When the complete human genome
sequence was available, the total number of genes was only about half the
number previously estimated52 and little
more than twice the number in a much
simpler organism such as C elegans. The
increased complexity of the human, as
compared with the worm, for example, is achieved by increasing the
number of different proteins encoded
by single genes, through alternative
splicing of messenger RNA, posttranscriptional and posttranslational modifications, formation of heteromeric
proteins (ie, proteins combining the
products of 2 or more different genes),
and so on. The estimated 30 000 to
40 000 genes encode more than 10
times that number of proteins. This has
resulted in a partial shift of focus from
the gene to proteins and from genomics to proteomics.53-55
Another interesting but not new finding in the sequences published in February 200150,51 is the nonuniform density of genes within chromosomes
(genes tend to be concentrated at the
ends of chromosome arms) and between chromosomes. Chromosomes 19
and 22 are particularly gene-rich and
chromosomes 13, 18, and 21 are relatively gene-poor. Related to the latter
observation may be the fact that chromosomes 13, 18, and 21 are involved
in the only autosomal trisomies that are
compatible with live birth. Before
completion of the HGP, the information on variation in gene density was
already known on the basis of numbers of genes mapped and indirectly on
the basis of GC vs AT content; high GC
correlates with high gene count. Nonetheless, precise confirmation by the
HGP was useful.
The Morbid Anatomy
of the Human Genome:
How Far Have We Come?
The anatomic metaphor is useful because it extends to the comparative
anatomy and evolution of the human
genome, as well as to its functional
anatomy, developmental anatomy, and,
particularly in the medical context, to
its morbid anatomy.2
Progress in the last 40 years in defining the morbid anatomy of the human genome is chronicled in MIM, a
catalog of human genes and genetic disorders (FIGURE 2).12 Computerized
since 1964, MIM was a pioneer in computer-based publication (eg, the first
edition in 1966) and has been available online as OMIM since 1987. The
periodic print editions, most recently
the 12th, published in 3 volumes in
1998, represent serial cross-sections of
the field of genetic medicine in the last
35 years. OMIM has the advantage of
daily updating, ease of searching, and
ease of linking to related sources of information, such as that on DNA and
protein structure and that on the related biomedical literature; MIM has the
archival advantage and that of accessibility in a nonelectronic setting, as well
as ease of browsing.
Throughout its history, MIM (and
OMIM) has attempted comprehensive
cataloging of gene mapping, especially
any gene related to disease, and comprehensive cataloging of specific disease-related mutations. The number of genes
with 1 or more disease-related mutations passed the 1000 mark about January 1, 2001.56 FIGURE 3A indicates the
pace of disease-related gene identification during the last 20 years.53 Figure 3B
indicates the pace at which specific genetic disorders have been characterized at the DNA level.53 The total number of characterized disorders (more
than 1600) exceeds the number of disease-related genes (more than 1200) because many such genes are the site of
mutations causing more than one distinct disorder, eg, the β-globin gene,
which is the site of mutations causing
sickle cell disease, thalassemia, Heinz
body hemolytic anemia, methemoglobinemia, erythremia, and so on. In part,
the excess of distinct, molecularly defined disorders over the number of genes
involved is a corollary of the “one gene,
many proteins” phenomenon supported by the unexpectedly low total
gene counts from analysis of the human genome sequence.
The counts in Figure 3 include both
germline (heritable) and somatic mutations. They do not include about 100
disease-related genes first identified as
translocation-fusion partners in leukemias and some solid tumors. The counts
do include some genes in which specific susceptibility or resistance alleles
have been identified through association studies.
Figure 2. Growth of Mendelian Inheritance in Man: A Catalog of Human Genes and Genetic Disorders (MIM)12 and Its Online Version (OMIM)
[Plot: number of MIM entries (0 to 14 000) by year (1965 to 2005), with the print editions MIM1 through MIM12 marked and the OMIM count reaching 13 083 on October 21, 2001.]
Each entry is an essay on a particular phenotype (usually a disorder) or gene, with extensive bibliographic
references and, in the case of OMIM, links to many
other sources of information.
Where Do We Go From Here?
Clearly, there is much we don’t know,
as reflected in the quotes:
As the radius of knowledge gets longer, the
circumference of the unknown expands
even more.
—Anonymous
How is it that we know so little, given that
we have so much information?
—Noam Chomsky
We may soon have a complete catalog
of the genes, but we do not know all of
the protein products of all of those
genes or the function of all of those proteins, or even a majority of them, in isolation, let alone in concert with others. We do not know the worldwide
variation in the genes. We do not know
the correlation between structural variation in the genes and variation in function as reflected in the phenotype. These
matters will occupy biology and medicine for a long time to come.
Progress in the HGP has brought several paradigm shifts that are relevant to
the neo-Vesalian basis of medicine provided by this ultimate anatomy of the genome. A shift from genomics57 to proteomics,55 or at least an extension to
proteomics, comes from the realization that several or even many different
proteins can be encoded by a single gene,
as must be the case to account for the
increased complexity of humans as compared with C elegans and Drosophila,
which have one third or one half as many
genes as humans. A shift from map-based gene discovery to sequence-based
gene discovery has occurred because availability of databases with
expressed sequence tags or complete sequence information on many different
organisms has made research in silico
(cybergenomics) possible.
A shift of emphasis from relatively
rare single-gene disorders to common
disorders of multifactorial causation
(complex traits) has occurred, now that
it is possible that susceptibility alleles
that collaborate in causation can be
identified. Such alleles may be found
through studies of association between the particular disorder and the
multitude of DNA markers, particularly single nucleotide polymorphisms, now available. As a consequence, DNA testing will no longer be
limited to specific diagnosis of Mendelian disorders but can be extended to
recognition of vulnerability or resistance to common disorders.
The Neo-Vesalian Influence
on 21st-Century Medicine
As indicated in more detail elsewhere,58 the availability of the human
genome sequence and information on
proteomics related to the sequence is
likely to change medicine in many ways.
It will influence reproductive medicine, for example, permitting ever more
specific diagnoses at earlier stages. It
will surely advance gene therapy. In a
more general way, genomics is likely to
render medicine more predictive and,
therefore, more preventive. Comprehensive “genome screens” for recognition of individual susceptibilities to
common disorders can be foreseen. Genomics-based clinical medicine will require that primary care physicians be
competent in the interpretation of gene
screens and in advising appropriate
health measures.
At the same time, on the traditional
turf of clinical medicine, diagnosis will
become more specific and precise, and
treatment also more specific and safer.
Genomics-based individualization of
medical care aims to achieve the right
treatment for the right patient.59 More
precise characterization of the genome in common disorders, such as hypertension and mental illness, will identify diagnostic subtypes for which
different therapies will be more effective. Cancer medicine is at the forefront in the use of genomics to match
specific diagnosis with specific treatment. Morphologically indistinguishable neoplasms have been shown on
“biopsy” of their altered genomes to be
different, suggesting that different treatments are indicated.
Better understanding of individual genomic constitutions should permit drug
therapy to get away from the one-size-fits-all approach. It should allow selection of drugs likely to be more effective in the treatment of a given disorder
in a given individual. Even though a
particular drug may be effective in the
treatment of a disorder in a particular
patient, the genomic constitution of the
patient may place him or her at an increased risk of adverse drug reaction.
Identifying such risks is part of picking the right treatment for the right patient and can reduce the very considerable toll of iatrogenic morbidity and
mortality.
As reflected in the large investments
in genomics by pharmaceutical firms, genomics is anticipated to lead to identification of new drug targets, ie, genes and
gene products involved in physiologic
processes that can be enhanced or downregulated by custom-designed drugs
(pharmacogenomics).59 Genomics-based drug development should lead to
entirely new medications for disorders
not now treatable and to more effective
and safer replacements for current drugs.
In summary, chromosome analysis,
gene mapping, and complete sequencing of the genome provide an anatomic
basis for all aspects of clinical medicine.
Figure 3. Pace of Disease Gene Discovery and Molecular Characterization of Clinical Disorders, 1981-2000
A, No. of Genes Discovered With Disease-Related Mutations. B, No. of Clinical Disorders Characterized at the Molecular Level.
[Plots not reproduced. In both panels the x-axis is Year and the y-axis is No.]
Reprinted with permission from Peltonen L, McKusick VA. Dissecting human disease in the post-genomic era. Science. 2001;291:1224-1229.53 A, The number of
disease genes discovered by the end of 2000 was 1112, including both germline and neoplasia-related somatic mutations. This number does not include all the genes
identified as translocation gene-fusion partners in neoplastic disorders. Numbers in parentheses indicate genes with disease-related polymorphic alleles (susceptibility
genes). B, The number of clinical disorders characterized by the end of 2000 was 1430. This number does not include the many neoplastic disorders caused by translocation-related fusion genes.
2294 JAMA, November 14, 2001—Vol 286, No. 18 (Reprinted)
©2001 American Medical Association. All rights reserved.
As with any powerful new development, there are hazards of misuse of information or techniques. Privacy and
confidentiality are issues of importance. A serious risk is public misunderstanding that will prevent the information from yielding its maximum benefit.
Blind fear of anything genetic is one problem. Another is the misconception of determinism, the idea that
the phenotype is hard-wired to the genotype. Most genetic tests will provide an
answer in probabilistic terms. It is useful for persons to understand the principles of probability in relation to their
own health. The fact that the chance of
a particular outcome is not 100% but
some lower number means that there
may be lifestyle, dietary, or medical measures that can be used to reduce the probability even further. Clearly, education
of both the public and health care professionals is vital if the full benefits of neo-Vesalian medicine are to be realized.
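The probabilistic reading of a test result can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the function names, the 40% baseline risk, and the two relative risk reductions are hypothetical values chosen for the example, not figures from any study. It also assumes, for simplicity, that the effects of separate measures are independent and multiply, which may not hold for real disorders.

```python
from typing import Iterable


def risk_after_measures(baseline_risk: float,
                        relative_reductions: Iterable[float]) -> float:
    """Return the probability of an outcome after applying preventive measures.

    baseline_risk is the probability reported by a (hypothetical) genetic test.
    Each entry of relative_reductions is the fraction by which one measure
    lowers the remaining risk. ASSUMPTION: measures act independently, so
    their effects multiply.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability between 0 and 1")
    risk = baseline_risk
    for reduction in relative_reductions:
        if not 0.0 <= reduction <= 1.0:
            raise ValueError("each relative reduction must be between 0 and 1")
        risk *= 1.0 - reduction
    return risk


def as_natural_frequency(risk: float, population: int = 100) -> str:
    """Express a probability as 'about N in population' for easier reading."""
    return f"about {round(risk * population)} in {population}"


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    baseline = 0.40            # risk suggested by a test result
    measures = [0.25, 0.20]    # e.g. a dietary change and a medication
    reduced = risk_after_measures(baseline, measures)
    print("Before measures:", as_natural_frequency(baseline))
    print("After measures: ", as_natural_frequency(reduced))
```

Stating the result as a natural frequency ("about N in 100") rather than as a bare probability is a presentation choice made here for readability, and it shows that a risk below 100% still leaves room for reduction.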
REFERENCES
1. McKusick VA. The anatomy of the human genome. Am J Med. 1980;69:267-276.
2. McKusick VA. The human genome through the eyes of Mercator and Vesalius. Trans Am Clin Climatol Assoc. 1981;42:66-90.
3. McKusick VA. The morbid anatomy of the human genome: a review of gene mapping in clinical medicine. Medicine. 1986;65:1-33, 1987;66:1-63, 1987;66:237-206, 1988;67:1-19.
4. McKusick VA. History of medical genetics. In: Rimoin DL, Connor JM, Pyeritz RE, eds. Emery-Rimoin Principles and Practice of Medical Genetics. 4th ed. Edinburgh, Scotland: Churchill Livingstone; 2001:3-36.
5. Tjio JH, Levan A. The chromosome number in man. Hereditas. 1956;42:1-6.
6. Ford CE, Hamerton JC. The chromosomes of man. Nature. 1956;178:1020-1023.
7. Watson JD, Crick FH. Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature. 1953;171:737-738.
8. Allen G, Benda CE, Böök JA, et al. “Mongolism.” Lancet. 1961;1:775.
9. Nowell PC, Hungerford DA. A minute chromosome in human chronic granulocytic leukemia. Science. 1960;132:1497.
10. McKusick VA. Marcella O’Grady Boveri (1865-1950) and the chromosome theory of cancer. J Med Genet. 1985;22:431-440.
11. Rowley JD. A new consistent abnormality in chronic myelogenous leukemia identified by quinacrine fluorescence and Giemsa staining. Nature. 1973;243:290-293.
12. McKusick VA. Mendelian Inheritance in Man. Baltimore, Md: Johns Hopkins University Press; 1966. [Online Mendelian Inheritance in Man. 12th ed. 1998. Available at: http://www.ncbi.nlm.nih.gov/omim.]
13. Mitelman F. Catalog of Chromosome Aberrations in Cancer. 5th ed. New York, NY: Wiley-Liss; 1994.
14. Wilson EB. The sex chromosomes. Arch Mikrosk Anat Entwicklungsmech. 1911;77:249-271.
15. Earle P. On the inability to distinguish colors. Am J Med Sci. 1845;9:346-354.
16. Porter IH, Schulze J, McKusick VA. Genetical linkage between the loci for glucose-6-phosphate dehydrogenase deficiency and colour-blindness in American Negroes. Ann Hum Genet. 1962;16:107-122.
17. Ricciuti FC, Ruddle FH. Assignment of three gene loci (PGK, HGPRT, and G6PD) to the long arm of the human X-chromosome by somatic cell genetics. Genetics. 1973;74:661-678.
18. Donahue RP, Bias WB, Renwick JH, McKusick VA. Probable assignment of the Duffy blood group locus to chromosome 1 in man. Proc Natl Acad Sci U S A. 1968;61:949-955.
19. McKusick VA, Ruddle FH. The status of the gene map of the human chromosomes. Science. 1977;196:390-405.
20. Weiss M, Green H. Human-mouse hybrid cell lines containing partial complements of human chromosomes and functioning human genes. Proc Natl Acad Sci U S A. 1967;58:1104-1111.
21. Miller OJ, Allderdice PW, Miller DA. Human thymidine kinase gene locus: assignment to chromosome 17 in a hybrid of man and mouse cells. Science. 1971;173:244-245.
22. Harper ME, Ullrich A, Saunders GF. Localization of the human insulin gene to the distal end of the short arm of chromosome 11. Proc Natl Acad Sci U S A. 1981;78:4458-4460.
23. Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980;32:314-331.
24. Gusella JF, Wexler NS, Conneally PM, et al. A polymorphic DNA marker genetically linked to Huntington’s disease. Nature. 1983;306:234-238.
25. Collins FS. Positional cloning: let’s not call it reverse anymore. Nat Genet. 1992;1:3-6.
26. Huntington’s Disease Collaborative Research Group. A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell. 1993;72:971-983.
27. Royer-Pokora B, Kunkel LM, Monaco AP, et al. Cloning the gene for an inherited human disorder (chronic granulomatous disease) on the basis of its chromosomal location. Nature. 1986;322:32-38.
28. Monaco AP, Neve RL, Colletti-Feener C, Bertelson CJ, Kurnit DM, Kunkel LM. Isolation of candidate cDNAs for portions of the Duchenne muscular dystrophy gene. Nature. 1986;323:646-650.
29. Rommens JM, Iannuzzi MC, Kerem B, et al. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science. 1989;245:1059-1065.
30. Avery OT, MacLeod CM, McCarty M. Studies on the chemical nature of the substance inducing transformation of pneumococcal types: induction of transformation by a deoxyribonucleic acid fraction from Pneumococcus type III. J Exp Med. 1944;79:137-158.
31. Cohen S, Chang A, Boyer H, Helling R. Construction of biologically functional bacterial plasmids in vitro. Proc Natl Acad Sci U S A. 1973;70:3240-3244.
32. Maxam AM, Gilbert W. A new method for sequencing DNA. Proc Natl Acad Sci U S A. 1977;74:1258-1262.
33. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977;74:5463-5467.
34. Anderson S, Bankier AT, Barrell BG, et al. Sequence and organization of the human mitochondrial genome. Nature. 1981;290:457-465.
35. Dulbecco R. A turning point in cancer research: sequencing the human genome. Science. 1986;231:1055-1056.
36. McKusick VA. Prospects for progress. In: Fraser FC, McKusick VA, eds. Congenital Malformations. Vol 3. Amsterdam, the Netherlands: Excerpta Medica; 1970:407.
37. Haldane JB. The formal genetics of man. Proc R Soc B. 1948;135:147-170.
38. McKusick VA. Mapping and sequencing the human genome. N Engl J Med. 1989;320:910-915.
39. Alberts BM, Botstein D, Brenner S, et al. Report of the Committee on Mapping and Sequencing the Human Genome. Washington, DC: National Academy Press; 1988.
40. Mullis K, Faloona F, Scharf S. Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Cold Spring Harb Symp Quant Biol. 1986;51:265-273.
41. McKusick VA. The gene map of Homo sapiens: status and prospectus. Cold Spring Harb Symp Quant Biol. 1986;51:15-27, 1123-1208.
42. Dib C, Faure S, Fizames C, et al. A comprehensive genetic map of the human genome based on 5264 microsatellites. Nature. 1996;380:152-154.
43. Chumakov IM, Rigault P, Le Gall I, et al. YAC contig map of the human genome. Nature. 1995;377(suppl):175-297.
44. Adams MD, Kelley JM, Gocayne JD, et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991;252:1651-1656.
45. Fleischmann RD, Adams MD, White O, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269:496-512.
46. Goffeau A, Barrell BG, Bussey H, et al. Life with 6000 genes. Science. 1996;274:546-567.
47. C elegans Sequencing Consortium. Genome sequence of the nematode C elegans: a platform for investigating biology. Science. 1998;282:2012-2018.
48. Chervitz SA, Aravinda L, Sherlock G, et al. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science. 1998;282:2022-2028.
49. Adams MD, Celniker SE, Holt RA, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185-2195.
50. Venter JC, Adams MD, Myers EW, et al. The sequence of the human genome. Science. 2001;291:1304-1351.
51. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860-921.
52. Field C, Adams MD, White O, Venter JC. How many genes in the human genome? Nat Genet. 1994;7:345-346.
53. Peltonen L, McKusick VA. Dissecting human disease in the post-genomic era. Science. 2001;291:1224-1229.
54. Wilkins MR, Pasquali C, Appel RD, et al. From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Biotechnology. 1996;14:61-65.
55. Pandey A, Mann M. Proteomics to study genes and genomes. Nature. 2000;405:837-846.
56. Antonarakis SE, McKusick VA. OMIM passes the 1000 disease-gene mark. Nat Genet. 2000;25:11.
57. McKusick VA, Ruddle FH. A new discipline, a new name, a new journal. Genomics. 1987;1:1-2.
58. Collins FS, McKusick VA. Implications of the human genome project for medical science. JAMA. 2001;285:540-544.
59. Roses AD. Pharmacogenomics and the practice of medicine. Nature. 2000;405:857-865.