Genes, Genomes, and Genomics
Evelyn Fox Keller
Biological Theory
ISSN 1555-5542
Volume 6
Number 2
Biol Theory (2012) 6:132-140
DOI 10.1007/s13752-012-0014-x
Your article is protected by copyright and
all rights are held exclusively by Konrad
Lorenz Institute. This e-offprint is for personal
use only and shall not be self-archived in
electronic repositories. If you wish to self-archive your work, please use the accepted
author’s version for posting to your own
website or your institution’s repository. You
may further deposit the accepted author’s
version on a funder’s repository at a funder’s
request, provided it is not made publicly
available until 12 months after publication.
Biol Theory (2011) 6:132–140
LONG ARTICLE
Received: 8 January 2012 / Accepted: 19 January 2012 / Published online: 31 March 2012
Konrad Lorenz Institute for Evolution and Cognition Research 2012
Abstract While scientific terms lack the stability of
physical objects, they are generally far more stable than the
various meanings associated with them. As a consequence,
they tend to carry older conceptions alongside those more
recently acquired, thereby exerting an effective drag against
conceptual change. I illustrate this claim with an analysis of
the shifting meanings of the term genome, originally used to
refer to a collectivity of genes, but more recently to an
organism’s complement of DNA. While genes were originally regarded as effectively autonomous formal agents,
and DNA as collections of genes, contemporary research
suggests that an organism’s DNA constitutes a far more
complex system designed to adapt and respond to the
environment in which it finds itself.
Keywords DNA · Genes · Genome · ncRNA · Reactive system · Regulation
It has been frequently observed that, unlike DNA, genes
are not physical objects but ‘‘merely concepts’’—things
that have no fixed existence, and about which there can be
no single fact of the matter. Indeed, the multiple and
shifting meanings of this term have been the subject of
extensive discussion by biologists as well as by historians
and philosophers of biology (see, e.g., Beurton et al. 2000;
Keller 2000; Snyder and Gerstein 2003; Pearson
2007). But what is less likely to be noted is that the word
itself has demonstrated remarkable endurance since its
coining in 1909. To be sure, it lacks the stability of a
physical object, but my point here is that it has exhibited
far greater stability than the meanings associated with it.
I argue that the endurance of words, especially in the face
of shifting meanings, has important consequences for how
we think about the phenomena these words were intended
to describe. The focus of the present essay is not, however,
on the term gene, but rather on the genome, a term introduced only 11 years after gene.
E. F. Keller
Program in Science, Technology, and Society, Massachusetts
Institute of Technology, Cambridge, MA, USA
e-mail: [email protected]
I will use an exploration of the changing meanings of
genome to illustrate a number of general observations
about the role of language in scientific thought, and more
specifically, about the role of words as vehicles of resistance to conceptual change. These observations—developed by a variety of authors but perhaps most notably by
Polanyi (1958)—can be formulated as follows:
(1) Scientific terms (words) often endure for long periods, but because the meanings of these terms depend
on the particular context of research, they undergo
constant change. As Michael Polanyi long ago
observed,1 such changes in meaning occur for the
most part outside of ‘‘focal awareness.’’ For this
reason, they tend to evade explicit marking, allowing
older meanings to persist alongside newer meanings,
even as a term is used by a single author; indeed, even
in the same text.
(2) Polanyi also observed that ‘‘Different languages …
sustain alternative conceptual frameworks, interpreting
all things that can be talked about in terms of somewhat
different allegedly recurrent features’’ (ibid). The point
I wish to stress, however, is that, just as different
languages sustain different conceptual frameworks, so
too do different meanings of the same terms. In effect,
different meanings of scientific terms constitute different scientific languages in and of themselves.
(3) Through the persistence of older meanings, words
carry past conceptual frameworks into the present and
future, sustaining the expectations and formulations
that had originally generated these earlier meanings
and infusing them into interpretations of new data.
In so doing, they constitute vehicles of resistance to
conceptual change, imposing an earlier generation’s
‘‘particular theory of the nature of things’’2 onto a
current generation’s perspective.
1 He wrote: ‘‘The meaning of speech thus keeps changing in the act
of groping for words without our ever being focally aware of the
change, and our groping invests words in this manner with a fund of
unspecifiable connotations’’ (1958, p. 112).
What is a Genome?
At least four different meanings are commonly invoked in
definitions of the genome, two of them inherited from the
era of classical genetics (one referring to genes and the other
to chromosomes), the third following the characterization of
DNA as the material basis of inheritance, and a fourth
which at least tacitly opens a question about the relation
between genes and genetic. Of particular interest here will
be the relation between genomes and genes. Given its history, it is inevitable that our understanding of what a genome is has been (and remains) deeply entwined with current
and prior conceptions of the gene. Furthermore, given the
variability of those conceptions, the relation between the
two terms will inevitably prove correspondingly unstable.
Yet, however variable, I will argue that that relation is itself
of interest; it is also of consequence—not only for how we
talk, but also for how we think. Especially, I suggest that the
view—in effect, the default view—of genomes as collections of genes (however these are conceived) constrains our
ability to conceptually adapt to many of the more radical
findings of genomic research. Recent work has posed acute
problems for so simple an understanding of the relation
between genes and genomes, but of a kind that are often
masked by the fluidity of usage of the former term, coupled
with its persistent conjunction with the latter.
2 As Polanyi put it, ‘‘In learning to speak, every child accepts a
culture constructed on the premises of the traditional interpretation of
the universe, rooted in the idiom of the group to which it was born,
and every intellectual effort of the educated mind will be made within
this frame of reference’’ (ibid).
Usage and Meanings of the Term Genome, 1920–1990
Like others, Joshua Lederberg and Alexa McCray (2001)
attribute the coining of the term to Hans Winkler who, in
1920, combined the word gene with the part of the word
chromosome (–ome) that signifies the collectivity of units
to form a new word. The new word, Winkler wrote, refers
to ‘‘the haploid chromosome set, which, together with the
pertinent protoplasm, specifies the material foundations of
the species’’ (quoted in Lederberg and McCray 2001,
p. 8). It was not much used until the early to mid 1960s
(see Fig. 1), but when it was employed, it was generally
taken (often without definition) as referring simultaneously to an organism’s complement of genes and to its
defining set of chromosomes, the tacit assumption being
that the two were equivalent—i.e., as if the chromosomes
consisted of nothing (or at least nothing of significance)
other than the complement of genes. Thus, e.g., in their
widely used genetics textbook, Adrian Srb and his colleagues write: ‘‘Among organisms with chromosomes,
each species has a characteristic set of genes, or genome.
In diploids a genome is found in each normal gamete.
It consists of a full set of the different kinds of chromosomes’’ (1965, p. 190). Indeed, it might be argued that the
distinction between a complement of genes (whatever they
might be) and a full set of chromosomes was not experimentally accessible in the context of classical genetics.
It might even be argued that it was correspondingly
inaccessible conceptually; certainly, history suggests that
it was not (or at least not generally) conceptually accessed. In later years, however—and especially after the
advent of molecular biology—that distinction did become
available to both conceptual and experimental scrutiny.
As techniques for distinguishing between DNA and the
various lipids and proteins that contribute to the composition of chromosomes developed, the conceptual difference between genes and chromosomes inevitably became
more conspicuous as well. In retrospect, one might think
that it would accordingly have come to demand explicit
marking. Apparently it did not, but I at least want to mark it,
referring to the complement of genes as genome-1, and to that
of chromosomes as genome-2.
By the 1960s, after DNA had been established as providing the material basis of genetics, and after biologists
began to shift their conception of genes from bits of
chromosomes to bits of DNA, the genome also acquired a
new (and third) meaning: an organism’s complete set of
DNA. Let me call this genome-3. As can be seen from
Fig. 1, overall use of the term rises significantly after 1960.
Figure 2 shows its usage growing steadily (albeit not yet
dramatically) throughout the 1970s and 1980s.
Fig. 1 Number of articles referring to the term genome in the biological
literature, 1934–1970 (from the Web of Science)
Fig. 2 Number of articles referring to genome in the biological
literature, 1971–1990 (from the Web of Science)
Genes, Genomes and Junk DNA
The interpretation of genomes in terms of DNA comes
increasingly into play during this period, even while earlier
definitions continue to persist. Furthermore, and adding yet
more complication, research in molecular biology soon led
to yet another definition, namely the genome as ‘‘all of a
living thing’s genetic material.’’ I call this genome-4.
The difference between genes and genetic material (information/instructions)—i.e., between genome-1 and genome-4—might not have been meaningful in the days when
genetics referred simply to the science of genes. But from the
1970s on, especially as the focus of molecular genetics
shifted to the study of eukaryotic organisms, and as the study
of regulation assumed increasing centrality to that science,
the relation between genes and genetics has come to seem
less straightforward. To the extent that regulation is a
property of DNA, it is surely genetic, but is it always
attributable to genes?3
A somewhat different challenge to the equivalence
between genes and genetic material might also have been
provoked by a series of findings, beginning in the late 1960s,
prompting the recognition of substantial expanses of non-protein-coding (‘‘nongenic’’ or ‘‘extra’’) DNA sequences in
eukaryotic genomes. Perhaps most important were the findings (1) of large amounts of repetitive DNA in 1968; (2) of the
wildly varying relationship between the amount of DNA in an
organism and its complexity (the ‘‘C-value paradox’’; Thomas
1971); and (3) of split genes (protein-coding sequences
interrupted by non-coding ‘‘introns’’) in 1977. But the designation of such DNA as ‘‘junk’’—a term first introduced by
Susumu Ohno (1972) to refer to the rampant degeneracy (nongenic and, as he thought, non-transcribed sequences) in
eukaryotic DNA—soon blunted that challenge. After the
appearance in 1980 of two back-to-back papers (Doolittle and
Sapienza 1980; Orgel and Crick 1980) that linked ‘‘junk’’
DNA to Richard Dawkins’ (1976) notion of ‘‘selfish’’ DNA,
the idea of ‘‘junk DNA’’ seemed to become entrenched.
Orgel and Crick begin by writing (1980, p. 604):
A piece of selfish DNA, in its purest form, has two
distinct properties:
(1) It arises when a DNA sequence spreads by forming
additional copies of itself within the genome.
(2) It makes no specific contribution to the phenotype.
But the authors want to use the term
in a wider sense, so that it can refer not only to
obviously repetitive DNA but also to certain other
DNA sequences that appear to have little or no
function, such as much of the DNA in the introns of
genes and parts of the DNA sequences between genes
…. The conviction has been growing that much of
this extra DNA is ‘‘junk,’’ in other words, that it has
little specificity and conveys little or no selective
advantage to the organism … in the case of selfish
DNA, the sequence which spreads makes no contribution to the phenotype of the organism, except
insofar as it is a slight burden to the cell that contains
it. (1980, p. 604)
3 Although not directly germane to the main theme of this essay, it
should be noted that a need to distinguish between genome-3 and
genome-4 has also been stimulated by current research, especially by
the growing appreciation of alternative forms of inheritance (and
hence, of alternate meanings of genetic) that arises out of contemporary work on epigenetics.
Until the early 1990s, the assumption that the large
amounts of non-coding DNA found in eukaryotic organisms had ‘‘little or no function,’’ contributed nothing to
their phenotype, and could therefore be ignored, remained
relatively uncontested. For all practical purposes, genomes
(or at least the interesting parts of genomes) could still
be thought of as collections of genes. Indeed, when the
Human Genome Project (HGP) first announced its intention to sequence the entire human genome, much of the
opposition to that proposal was premised on this assumption. Thus, e.g., Bernard Davis of the Harvard Medical
School complained that ‘‘blind sequencing of the genome
can also lead to the discovery of new genes …, but this
would not be an efficient process. On average, it would be
necessary to plow through 1–2 million ‘junk’ bases before
encountering an interesting sequence’’ (Davis 1990,
p. 343). And in a similar vein, Robert Weinberg of MIT
argued:
The sticky issue arises at the next stage of the project,
its second goal, which will involve determining
the entire DNA sequence of our own genome and
those of several others. Here one might indeed raise
questions, as it is not obvious how useful most of
this information will be to anyone. This issue arises
because upwards of 95 % of our genome contains
sequence blocks that seem to carry little if any biological information. As is well known, our genome is
riddled with vast numbers of repetitive sequences,
pseudogenes, introns, and intergenic segments. These
DNA segments all evolve rapidly, apparently because
their sequence content has little or no effect on phenotype. Some of the sequence information contained
in them may be of interest to those studying the
recent and distant evolutionary precursors of our
modern genome. But in large part, this vast genetic
desert holds little promise of yielding many gems. As
more and more genes are isolated and sequenced, the
arguments that this junk DNA will yield great surprises become less and less persuasive. (Weinberg
1991, p. 78)
Weinberg was soon proven wrong. In the mid-1990s,
and largely as a result of research enabled by large-scale
DNA sequencing, confidence that non-coding DNA is nonfunctional, that it could be regarded as junk, began an ever
more rapid decline; and by the dawn of the new century,
few authors (not even Weinberg) still saw the equation
between non-coding DNA and junk as viable.
Genes, Genomes, and Genomics, 1990 to the Present
The launching of the HGP in 1990 was almost certainly the
most significant moment in the entire history of the term
genome. From that point on, usage of the term explodes,
jumping from 587 references per year in 1989 to 871 in
1990, and 3,777 in 1991. By 2002, the number passes
10,000, and by 2010, exceeds 21,000 (see Fig. 3).
The numbers displayed in Fig. 3 are telling. In particular, they point to a critical transformation in the history of
genetics that was in good part triggered by the HGP. The
explosive growth over these last two decades in use of the
term genome in the professional literature reflects not only
the rise of a new science of genomics, but also, the growing
incompatibilities between its concepts, methods, and
ontologies and those of the earlier science of genetics. As
Barry Barnes and John Dupré describe that transition in
their recent book, Genomes and What to Make of Them, it
‘‘involves genomes rather than genes being treated as real,
and systems of interacting molecules rather than sets of
discrete particles becoming the assumed underlying objects
of research’’ (2008, p. 8).
Fig. 3 Number of articles referring to genome in the biological
literature, 1989–2010 (from the Web of Science)
Yet genes have hardly disappeared from our thinking
about genomes. To be sure, it has become difficult to find a
definition of the gene upon which researchers can agree,
and slippage between different understandings of the
term has become endemic. Nevertheless, gene talk remains
prominent in discussions of genomes, and what one means
by the former term inevitably affects how one understands
what the genome is and what it does. There does,
however, seem to be a default response that, despite the
variability in definitions of the gene, continues routinely to
be offered whenever a definition is demanded: the gene is a
protein-coding sequence. And indeed, it is precisely in
relation to this default definition that research in molecular
genomics has proven such a challenge. I briefly referred to
this challenge in my earlier discussion of the rise and fall of
the concept of junk DNA, there suggesting that the demise
of that concept after the mid-1990s can be directly attributed to the growing impact of genomics research. Here
I want to elaborate on that suggestion.
Of particular importance were the findings, first, that
much if not most of the non-coding DNA in eukaryotic
organisms is in fact transcribed; second, that a surprising
range of regulatory functions (by 2001, these include RNA
interference, co-suppression, transgene silencing, imprinting, and methylation) can be attributed to what by this time
had come to be called non-coding RNA (ncRNA); and
third, that the human genome contains far fewer genes than
had been anticipated. To some, these findings called for a
radical reconceptualization of the nature of genetic regulation, and especially, of the role of RNA in the architecture of complexity in higher organisms; others focused on
the unduly restrictive definition of genes as protein-coding
sequences of DNA,4 and sought to cast ncRNA sequences
as other kinds of ‘‘genes’’ (see, e.g., Eddy 2001). In any
case, all of these developments clearly challenged earlier
equations between protein-coding sequences and function
(or more precisely, between non-coding sequences and
non-function).
4 According to John Mattick, e.g., ‘‘The failure to recognize the
possible significance of these RNAs is based on the central dogma, as
determined from bacterial molecular genetics, that genes are synonymous with proteins, and that RNAs are just temporary reflections of
this information’’ (2001, p. 987).
In 2003, a new public research consortium was launched
by the US National Human Genome Research Institute
(NHGRI) with the explicit goal of finding all the functional
elements in the human genome. It was called ENCODE
(the ENCyclopedia Of DNA Elements), and the results of
the first phase—the detailed analysis of 1 % of the human
genome—were reported in The ENCODE Project Consortium (2007). They confirmed that the human genome is
‘‘pervasively transcribed’’ (apparently in a developmentally
regulated way); that most of the transcripts are non-coding;
and that regulatory sequences may, on the one hand, overlap
protein-coding sequences and, on the other hand, are
often far removed from coding sequences. Perhaps the
biggest surprise was that functional sequences need not be
under evolutionary constraint.
The reaction was swift. In his ‘‘News and Views’’
commentary on the report, John Greally wrote,
We usually think of the functional sequences in the
genome solely in terms of genes, the sequences
transcribed to messenger RNA to generate proteins.
This perception is really the result of effective publicity by the genes, who take all of the credit … They
have even managed to have the entire DNA sequence
referred to as the ‘genome’, as if the collective
importance of genes is all you need to know about the
DNA in a cell … Even this preliminary study reveals
that the genome is much more than a mere vehicle for
genes, and sheds light on the extensive molecular
decision-making that takes place before a gene is
expressed. (2007, p. 783)
Over the last few years, surprises about the range and
extent of ncRNA involvement in molecular decision
making have only accelerated. John Mattick (2010b) puts
their role in the modulation of chromatin architecture and
epigenetic memory at the top of the list. These are processes crucial not only to embryogenesis but also to brain
development, plasticity, learning, and, more generally, to
regulating the impact of environmental signals. NcRNA
transcripts have been shown to be critical to the regulation
not only of transcription (both cis and trans driven), but
also of alternative splicing, of chromosome dynamics, and
of epigenetic memory. Finally, and perhaps most revolutionary, are the findings that implicate ncRNA in the
editing of RNA transcripts, thereby modulating the configuration of the regulatory networks these transcripts form.
Behavioral (or physiological) adaptation does not
require that environments can directly alter DNA sequence:
environmental signals trigger a wide range of signal
transduction cascades that lead to short-term adaptation.
But mechanisms for editing RNA sequences that in turn
regulate gene expression may provide the most dramatic
case yet for environmental adaptation. Moreover, epigenetic mechanisms (also mediated by ncRNA) begin to
dissolve the distinction between short-term and long-term
adaptations by lending to such adaptations the possibility
of being transmitted inter-generationally. As Mattick
explains, ‘‘the ability to edit RNA … suggests that not only
proteins but also—and perhaps more importantly—regulatory sequences can be modulated in response to external
signals and that this information may feedback via RNA-directed chromatin modifications into epigenetic memory’’
(2010a, p. 551).
Finally, environmental signals are not restricted to the
simple physical and chemical stimuli that directly impinge
on the skin: organisms with central nervous systems have
receptors for forms of perception that are both far more
complex and longer range. Humans have especially
sophisticated perceptual capacities, enabling them to
respond to a wide range of complex visual, auditory, linguistic, and behavioral/emotional signals in their extended
environment. Apparently, as researchers have recently
begun to demonstrate, responses to such signals can extend
all the way down to the level of gene expression. For
instance, in 2007 Steve Cole and his colleagues compared
the gene expression patterns in the leukocytes of those who
felt socially isolated with the expression patterns of those
who felt connected to others, and were able to demonstrate
systematic differences in the expression of roughly 1 % of
the genes assayed (Cole et al. 2007). Subsequent studies
have correlated gene expression patterns with other social
indicators (e.g., socioeconomic status), providing further
evidence for mechanisms of ‘‘social signal transduction’’
that reach all the way down to the level of the genome.
Cole explains that, ‘‘Socio-environmental processes regulate human gene expression by activating central nervous
system processes that subsequently influence hormone and
neurotransmitter activity in the periphery of the body.
Peripheral signaling molecules interact with cellular
receptors to activate transcription factors’’ (2009, p. 133).
137
if the meaning of genetic material (information or
instructions) could still be reduced to a set of discrete and
autonomous units called genes. Indeed, a review of the
major current scientific glossaries suggests that all four
meanings of the term continue in common usage to the
present day, in a pattern of coexistence that can be readily
observed from the list of definitions shown in Table 1.
(It should be noted that the glossaries cited are intended for
the larger professional and non-professional community;
they are organized in an effort to educate and inform nontechnical readers of the meanings of these terms as
employed in the technical literature.)
Thus, for example, the Oxford Dictionary of Genetics
happily defines the genome as ‘‘the total DNA’’ in a
chromosome set [genome-3] or ‘‘all of the genes carried by
… this chromosome set’’; the Craig Venter Institute defines
it, alternately, as ‘‘all of a living thing’s genetic material,’’
‘‘the entire set of hereditary instructions for building,
running, and maintaining an organism, and passing life on
to the next generation’’ (genome-4), ‘‘A collection of
genes’’ (genome-1), and ‘‘all the DNA in a cell’’ (genome3). The Science Primer provided by the NIH similarly
confounds these various definitions, telling us that a
genome
contains all of the biological information needed to
build and maintain a living example of that organism
[genome-4]. The biological information contained in
a genome is encoded in its deoxyribonucleic acid
(DNA) [genome-3] and is divided into discrete units
called genes [genome-1]. Genes code for proteins that
attach to the genome at the appropriate positions and
switch on a series of reactions called gene expression.
What is a Genome Today?
Old and New Conceptual Frameworks
As experts in the field now generally recognize, the gap
between a full complement of protein-coding sequences
and the genetic material (or DNA) of an organism is in fact
huge: e.g., only 1.2 % of human DNA is currently estimated to be devoted to protein-coding sequences. Yet even
in the face of so glaring a disparity, even with the growing
recognition of how complex and multi-layered are the
regulatory processes enabled by the genome, that entity
continues to be understood as a collection of genes that are
themselves impervious to environmental input. To be sure,
other definitions of the genome have been added since the
term was first introduced (genome-3 and genome-4), but
neither of these has quite replaced the original definition
(genome-1). Instead, the more recent definitions appear to
have simply adjoined to the others, as if the genome can
still be understood as indiscriminately referring to the
organism’s totality of genes and its totality of DNA, and as
What is most immediately striking here is not only the
persistence of older usages of the term, maintained
alongside meanings corresponding to more recent experimental practices, but also, through that persistence, the
perpetuation of conceptual frameworks to which the older
meanings were attached. Thus, e.g., an understanding of
the genome as an organism’s totality of genes recalls the
classical discourse of gene action and Alfred H. Sturtevant’s reformulation of embryology’s question of how an
egg develops into a complex many-celled organism as one
of ‘‘how genes produce their effects.’’ As Sturtevant wrote,
‘‘in most cases there is a chain of reaction between the
direct activity of a gene and the end-product that the
geneticist deals with as a character’’ (1932, p. 307).
Molecular biology succeeded in unpacking that ‘‘chain of
reaction’’: genes were identified as DNA sequences, their
‘‘direct activity’’ as that of coding for proteins, and the
‘‘discourse of gene action’’ was replaced by the central
dogma. Indeed, the Craig Venter Institute’s genome (like
that implied by the Science Primer of the NIH) is still a
‘‘collection of genes’’ carrying ‘‘the entire set of hereditary
instructions for building, running, and maintaining an
organism, and passing life on to the next generation.’’

Table 1 Definitions of genome obtained from a range of scientific glossaries (both official and semi-official)

Source: Biology Online, http://www.biology-online.org/dictionary
Definition: (1) The complete set of genes in an organism. (2) The total genetic content in one set of chromosomes

Source: Oxford Reference Dictionary of Biology
Definition: All the genes contained in a single set of chromosomes, i.e. in a haploid nucleus

Source: Oxford Reference Dictionary of Genetics
Definition: In prokaryotes and eukaryotes, the total DNA in a single chromosome and in a haploid chromosome set (q.v.), respectively, or all of the genes carried by this chromosome or chromosome set; in viruses, a single complement of DNA or RNA

Source: Glossary of Genetic Terms, Genetic Education Center, Univ. of Kansas Medical Center
Definition: All of the genes carried by a single gamete; the DNA content of an individual, which includes 44 autosomes, 2 sex chromosomes, and the mitochondrial DNA

Source: Glossary, Human Genome Project Information (http://www.ornl.gov/sci/techresources/Human_Genome/glossary)
Definition: All genetic material in the chromosomes of a particular organism; its size is generally given as its total number of base pairs

Source: National Human Genome Research Institute Glossary (http://ghr.nlm.nih.gov/glossary=genome)
Definition: The genome is the entire set of genetic instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes, found in the nucleus, as well as a small chromosome found in the cells’ mitochondria. These chromosomes, taken together, contain approximately 3.1 billion bases of DNA sequence

Source: National Human Genome Research Institute, Genetic Home Reference Handbook (http://ghr.nlm.nih.gov/handbook/hgp/genome)
Definition: A genome is an organism’s complete set of DNA, including all of its genes. Each genome contains all of the information needed to build and maintain that organism

Source: Genome News Network, Craig Venter Institute, http://www.genomenewsnetwork.org/resources/whats_a_genome/Chp1_1_1.shtml
Definition: A genome is all of a living thing’s genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation. The whole shebang. A gene is a small piece of the genome. It’s the genetic equivalent of the atom: as an atom is the fundamental unit of matter, a gene is the fundamental unit of heredity…

Source: Genome News Network, Craig Venter Institute, Glossary, http://www.genomenewsnetwork.org/resources/glossary/index.php#g
Definition: Gene: A piece of DNA used by cells to manufacture proteins, which carry out the business of cells. Each human gene is a template for one or more proteins. Genome: A collection of genes. All living things have genomes…A genome contains the biological information for building, running, and maintaining an organism, and for passing life on to the next generation…A precise definition of genome is ‘‘all the DNA in a cell’’ because this includes not only genes but also DNA that is not part of a gene, or non-coding DNA

Source: National Center for Biotechnology Information, Science Primer, NIH, http://www.ncbi.nlm.nih.gov/About/primer/genetics_genome.html
Definition: Life is specified by genomes. Every organism, including humans, has a genome that contains all of the biological information needed to build and maintain a living example of that organism. The biological information contained in a genome is encoded in its deoxyribonucleic acid (DNA) and is divided into discrete units called genes. Genes code for proteins that attach to the genome at the appropriate positions and switch on a series of reactions called gene expression

Despite all the changes our conception of gene has
undergone since the days of Sturtevant, even the most
recent formulations retain the view of genes (and hence of
genomes) as effectively autonomous formal agents, containing ‘‘all of the biological information needed to build
and maintain a living example of that organism,’’ the
blueprint for an organism’s life. But current research in
genomics leads to a rather different picture, and it does so
by focusing attention on features that have so far been
missing from our conceptual framework. In addition to
providing information required for building and maintaining an organism, the genome also provides a vast amount
of information for adapting and responding to—for interacting with—the environment in which it finds itself—as
indeed it must if the organism is to develop more or less
normally, and to survive more or less adequately.
Rather than a set of genes initiating causal chains
leading to the formation of traits, I suggest that the genome
that now appears before us is first and foremost an
exquisitely sensitive reaction (or response) mechanism—a
device for regulating the production of specific proteins in
response to the constantly changing signals it receives from
its environment. The signals that the genome detects come
most immediately from its intra-cellular environment, but
these reflect, in turn, input from the external environments
both of the cell and of the organism.
This reformulation gives rise to an obvious question:
if the genome is so responsive to its environment, how is it
that the developmental process is as reliable as it is? This is
a question of major importance in biology, and it is rapidly
becoming evident that the answer must be sought not only
in the structural (sequence) stability of the genome, but
also in the relative constancy of the environmental inputs,
and, most importantly, in the dynamic stability of the
system as a whole (see, e.g., Keller 2000). Genomes are
responsive, but far from infinitely so; the range of possible
responses is severely constrained, both by the organizational dynamics of the system in which they are embedded
and by their own structure.
Conclusion: Consequences of Such a Reformulation
Changes in DNA sequences (mutations) deserve the attention we give them because they endure: they are passed on from one
generation to the next and are, in a word, inherited. Even if not
themselves genes, they are clearly genetic. Some of these
mutations may affect protein sequences, but far more
commonly, what they alter is the organism’s capacity to
respond effectively to the environment in which the DNA
finds itself, or to respond differentially to altered environments. This conclusion may be especially important in
medical genomics where researchers routinely seek to
correlate the occurrence of disease with sequence variations
in the DNA. Since the sequences thus identified are rarely
located within protein-coding regions of the DNA, the
significance of the correlation must lie elsewhere, i.e., in the
regulatory functions of the associated non-genic DNA.
Mutations also provide the raw material for natural
selection. But when we speak of natural selection as having
programmed the human genome, we should remember that
it is precisely the capacities to respond and adapt for which
natural selection has programmed the human genome.
Unfortunately, however, the easy slide from genetics to talk
about genes, with all the causal powers conventionally
attributed to those entities, makes this an exceedingly hard
lesson to keep hold of.
Finally, this reconceptualization of the genome allows
us—indeed obliges us—to abandon the twin dichotomies—
on the one hand, between genetics and environment, and on
the other, between nature and culture—that have driven so
much unnecessary debate, for so many decades. If much of
what the genome ‘‘does’’ is to respond to signals from
its environment, then the bifurcation of developmental
influences into the categories of genetic and environmental
makes no sense. Similarly, if we understand the term
environment to include cultural dynamics (as indeed we
must), neither does the bifurcation of biological and
cultural factors. We have long understood that organisms
interact with their environments, that interactions between
genetics and environment, between biology and culture, are
crucial to making us what we are. What research in
genomics has shown is that biology itself is constituted by
those interactions, and is so constituted at every level, even
at the level of genetics. Indeed, one might say that what
makes a molecule—any molecule—biological is precisely
its capacity to sense and react to its environment. To quote
a recent article on ‘‘Biology as Reactivity’’ (i.e., on the
value of viewing biological systems as fundamentally
reactive systems), ‘‘reactive systems ‘live’ … in order to
react’’ (Fisher et al. 2011, p. 73).
That scientific terms point us to the future even while
carrying baggage from the past seems clear. The fact that
this dual role can give rise to tension seems equally clear.
There is, after all, an inherent unpredictability to scientific
inquiry. As Hans-Jörg Rheinberger (1997) has stressed,
experimental systems are mechanisms for generating
surprise and novelty. Sometimes, the surprises generated
by new research demand a turn from past conceptualizations so sharp that the older terms can no longer bear the
tension. The obvious question is: does molecular biology
now find itself at such a crossroads?
References
Barnes B, Dupré J (2008) Genomes and what to make of them.
University of Chicago Press, Chicago
Beurton PJ, Falk R, Rheinberger HJ (eds) (2000) The concept of the
gene in development and evolution. Cambridge University Press,
Cambridge
Cole SW (2009) Social regulation of human gene expression. Curr
Dir Psychol Sci 18(3):132–137
Cole SW, Hawkley LC, Arevalo JM, Sung CY, Rose RM, Cacioppo
JT (2007) Social regulation of gene expression in human
leukocytes. Genome Biol 8:R189
Davis BD (1990) The human genome and other initiatives. Science
249(4967):342–343
Dawkins R (1976) The selfish gene. Oxford University Press,
New York
Doolittle WF, Sapienza C (1980) Selfish genes, the phenotype
paradigm and genome evolution. Nature 284(5757):601–603
Eddy SR (2001) Non-coding RNA genes and the modern RNA world.
Nat Rev Genet 2:919–929
Fisher J, Harel D, Henzinger TA (2011) Biology as reactivity.
Commun ACM 54(10):72–82
Greally JM (2007) Genomics: encyclopaedia of humble DNA. Nature
447:782–783
Keller EF (2000) The century of the gene. Harvard University Press,
Cambridge
Lederberg J, McCray AT (2001) ‘Ome sweet’ omics: a genealogical
treasury of words. Scientist 15(7):8
Mattick JS (2001) Non-coding RNAs: the architects of eukaryotic
complexity. EMBO Rep 2(11):986–991
Mattick JS (2010a) RNA as the substrate for epigenome–environment
interactions. BioEssays 32:548–552
Mattick JS (2010b) Non-coding RNAs in epigenetics (interview).
http://www.epigenie.com/Interviews/John-Mattick-ncRNAs-onthe-Epigenome.html. Accessed 15 Dec 2011
Ohno S (1972) So much ‘‘junk’’ DNA in the genome. In: Smith HH
(ed) Evolution of genetic systems. Brookhaven symposia in
biology, vol 23. Gordon & Breach, New York, pp 366–370
Orgel LE, Crick FH (1980) Selfish DNA: the ultimate parasite. Nature
284(5757):604–607
Pearson H (2007) Genetics: what is a gene? Nature 441:398–401
Polanyi M (1958) Personal knowledge: towards a post-critical
philosophy. University of Chicago Press, Chicago
Rheinberger HJ (1997) Toward a history of epistemic things:
synthesizing proteins in the test tube. Stanford University Press,
Stanford
Snyder M, Gerstein M (2003) Defining genes in the genomics
era. Science 300(5617):258–260
Srb AM, Owen RD, Edgar RS (1965) General genetics, 2nd edn.
Freeman & Company, New York
Sturtevant AH (1932) The use of mosaics in the study of the
developmental effects of genes. In: Proceedings of the sixth
international congress of genetics, Ithaca, New York, p 304
The ENCODE Project Consortium (2007) Identification and analysis
of functional elements in 1% of the human genome by the
ENCODE pilot project. Nature 447:799–816
Thomas CA Jr (1971) The genetic organization of chromosomes.
Annu Rev Genet 5:237–256
Weinberg RA (1991) There are two large questions. Debate 5:78