Genes, Genomes, and Genomics

Evelyn Fox Keller

Biological Theory, ISSN 1555-5542, Volume 6, Number 2
Biol Theory (2012) 6:132–140
DOI 10.1007/s13752-012-0014-x

LONG ARTICLE

Received: 8 January 2012 / Accepted: 19 January 2012 / Published online: 31 March 2012
© Konrad Lorenz Institute for Evolution and Cognition Research 2012

Abstract While scientific terms lack the stability of physical objects, they are generally far more stable than the various meanings associated with them. As a consequence, they tend to carry older conceptions alongside those more recently acquired, thereby exerting an effective drag against conceptual change. I illustrate this claim with an analysis of the shifting meanings of the term genome, originally used to refer to a collectivity of genes, but more recently to an organism’s complement of DNA. While genes were originally regarded as effectively autonomous formal agents, and DNA as collections of genes, contemporary research suggests that an organism’s DNA constitutes a far more complex system designed to adapt and respond to the environment in which it finds itself.
Keywords DNA · Genes · Genome · ncRNA · Reactive system · Regulation

E. F. Keller, Program in Science Technology and Society, Massachusetts Institute of Technology, Cambridge, MA, USA

It has been frequently observed that, unlike DNA, genes are not physical objects but “merely concepts”—things that have no fixed existence, and about which there can be no single fact of the matter. Indeed, the multiple and shifting meanings of this term have been the subject of extensive discussion by biologists as well as by historians and philosophers of biology (see, e.g., Beurton et al. 2000; Keller 2000; Snyder and Gerstein 2003; Pearson 2007). But what is less likely to be noted is that the word itself has demonstrated remarkable endurance since its coining in 1909. To be sure, it lacks the stability of a physical object, but my point here is that it has exhibited far greater stability than the meanings associated with it. I argue that the endurance of words, especially in the face of shifting meanings, has important consequences for how we think about the phenomena these words were intended to describe.

The focus of the present essay is not, however, on the term gene, but rather on the genome, a term introduced only 11 years after gene. I will use an exploration of the changing meanings of genome to illustrate a number of general observations about the role of language in scientific thought, and more specifically, about the role of words as vehicles of resistance to conceptual change. These observations—developed by a variety of authors but perhaps most notably by Polanyi (1958)—can be formulated as follows:

(1) Scientific terms (words) often endure for long periods, but because the meanings of these terms depend on the particular context of research, they undergo constant change.
(2) As Michael Polanyi long ago observed,1 such changes in meaning occur for the most part outside of “focal awareness.” For this reason, they tend to evade explicit marking, allowing older meanings to persist alongside newer meanings, even as a term is used by a single author; indeed, even in the same text. Polanyi also observed that “Different languages … sustain alternative conceptual frameworks, interpreting all things that can be talked about in terms of somewhat different allegedly recurrent features” (ibid). The point I wish to stress, however, is that, just as different languages sustain different conceptual frameworks, so too do different meanings of the same terms. In effect, different meanings of scientific terms constitute different scientific languages in and of themselves.

(3) Through the persistence of older meanings, words carry past conceptual frameworks into the present and future, sustaining the expectations and formulations that had originally generated these earlier meanings and infusing them into interpretations of new data. In so doing, they constitute vehicles of resistance to conceptual change, imposing an earlier generation’s “particular theory of the nature of things”2 onto a current generation’s perspective.

1 He wrote: “The meaning of speech thus keeps changing in the act of groping for words without our ever being focally aware of the change, and our groping invests words in this manner with a fund of unspecifiable connotations” (1958, p. 112).

What is a Genome?

At least four different meanings are commonly invoked in definitions of the genome, two of them inherited from the era of classical genetics (one referring to genes and the other to chromosomes), the third following the characterization of DNA as the material basis of inheritance, and a fourth which at least tacitly opens a question about the relation between genes and genetic.
Of particular interest here will be the relation between genomes and genes. Given its history, it is inevitable that our understanding of what a genome is has been (and remains) deeply entwined with current and prior conceptions of the gene. Furthermore, given the variability of those conceptions, the relation between the two terms will inevitably prove correspondingly unstable. Yet, however variable, I will argue that that relation is itself of interest; it is also of consequence—not only for how we talk, but also for how we think. In particular, I suggest that the view—in effect, the default view—of genomes as collections of genes (however these are conceived) constrains our ability to adapt conceptually to many of the more radical findings of genomic research. Recent work has posed acute problems for so simple an understanding of the relation between genes and genomes, but of a kind that are often masked by the fluidity of usage of the former term, coupled with its persistent conjunction with the latter.

2 As Polanyi put it, “In learning to speak, every child accepts a culture constructed on the premises of the traditional interpretation of the universe, rooted in the idiom of the group to which it was born, and every intellectual effort of the educated mind will be made within this frame of reference” (ibid).

Usage and Meanings of the Term Genome, 1920–1990

Like others, Joshua Lederberg and Alexa McCray (2001) attribute the coining of the term to Hans Winkler who, in 1920, combined the word gene with the part of the word chromosome (–ome) that signifies the collectivity of units to form a new word. The new word, Winkler wrote, refers to “the haploid chromosome set, which, together with the pertinent protoplasm, specifies the material foundations of the species” (quoted in Lederberg and McCray 2001, p. 8). It was not much used until the early to mid 1960s (see Fig.
1), but when it was employed, it was generally taken (often without definition) as referring simultaneously to an organism’s complement of genes and to its defining set of chromosomes, the tacit assumption being that the two were equivalent—i.e., as if the chromosomes consisted of nothing (or at least nothing of significance) other than the complement of genes. Thus, e.g., in their widely used genetics textbook, Adrian Srb and his colleagues write: “Among organisms with chromosomes, each species has a characteristic set of genes, or genome. In diploids a genome is found in each normal gamete. It consists of a full set of the different kinds of chromosomes” (1965, p. 190). Indeed, it might be argued that the distinction between a complement of genes (whatever they might be) and a full set of chromosomes was not experimentally accessible in the context of classical genetics. It might even be argued that it was correspondingly inaccessible conceptually; certainly, history suggests that it was not (or at least not generally) conceptually accessed.

In later years, however—and especially after the advent of molecular biology—that distinction did become available to both conceptual and experimental scrutiny. As techniques for distinguishing between DNA and the various lipids and proteins that contribute to the composition of chromosomes developed, the conceptual difference between genes and chromosomes inevitably became more conspicuous as well. In retrospect, one might think that it would accordingly come to demand explicit marking. Apparently it did not, but I at least want to mark it, referring to the complement of genes as genome-1, and to the complement of chromosomes as genome-2.

By the 1960s, after DNA had been established as providing the material basis of genetics, and after biologists began to shift their conception of genes from bits of chromosomes to bits of DNA, the genome also acquired a new (and third) meaning: an organism’s complete set of DNA. Let me call this genome-3.
As can be seen from Fig. 1, overall use of the term rises significantly after 1960. Figure 2 shows its usage growing steadily (albeit not yet dramatically) throughout the 1970s and 1980s.

Fig. 1 Number of articles referring to the term genome in the biological literature, 1934–1970 (from the Web of Science)

Fig. 2 Number of articles referring to genome in the biological literature, 1971–1990 (from the Web of Science)

Genes, Genomes and Junk DNA

The interpretation of genomes in terms of DNA comes increasingly into play during this period, even while earlier definitions continue to persist. Furthermore, and adding yet more complication, research in molecular biology soon led to yet another definition, namely the genome as “all of a living thing’s genetic material.” I call this genome-4. The difference between genes and genetic material (information/instructions)—i.e., between genome-1 and genome-4—might not have been meaningful in the days when genetics referred simply to the science of genes. But from the 1970s on, especially as the focus of molecular genetics shifted to the study of eukaryotic organisms, and as the study of regulation assumed increasing centrality to that science, the relation between genes and genetics has come to seem less straightforward. To the extent that regulation is a property of DNA, it is surely genetic, but is it always attributable to genes?3

A somewhat different challenge to the equivalence between genes and genetic material might also have been provoked by a series of findings, beginning in the late 1960s, prompting the recognition of substantial expanses of non-protein-coding (“nongenic” or “extra”) DNA sequences in eukaryotic genomes.
Perhaps most important were the findings (1) of large amounts of repetitive DNA in 1968; (2) of the wildly varying relationship between the amount of DNA in an organism and its complexity (the “C-value paradox”; Thomas 1971); and (3) of split genes (protein-coding sequences interrupted by non-coding “introns”) in 1977. But the designation of such DNA as “junk”—a term first introduced by Susumu Ohno (1972) to refer to the rampant degeneracy (nongenic and, as he thought, non-transcribed sequences) in eukaryotic DNA—soon blunted that challenge. After the appearance in 1980 of two back-to-back papers (Doolittle and Sapienza 1980; Orgel and Crick 1980) that linked “junk” DNA to Richard Dawkins’ (1976) notion of “selfish” DNA, the idea of “junk DNA” seemed to become entrenched. Orgel and Crick begin by writing (1980, p. 604):

A piece of selfish DNA, in its purest form, has two distinct properties:
(1) It arises when a DNA sequence spreads by forming additional copies of itself within the genome.
(2) It makes no specific contribution to the phenotype.

3 Although not directly germane to the main theme of this essay, it should be noted that a need to distinguish between genome-3 and genome-4 has also been stimulated by current research, especially by the growing appreciation of alternative forms of inheritance (and hence, of alternate meanings of genetic) that arises out of contemporary work on epigenetics.

But the authors want to use the term in a wider sense, so that it can refer not only to obviously repetitive DNA but also to certain other DNA sequences that appear to have little or no function, such as much of the DNA in the introns of genes and parts of the DNA sequences between genes ….
The conviction has been growing that much of this extra DNA is “junk,” in other words, that it has little specificity and conveys little or no selective advantage to the organism … in the case of selfish DNA, the sequence which spreads makes no contribution to the phenotype of the organism, except insofar as it is a slight burden to the cell that contains it. (1980, p. 604)

Until the early 1990s, the assumption that the large amounts of non-coding DNA found in eukaryotic organisms had “little or no function,” contributed nothing to their phenotype, and could therefore be ignored, remained relatively uncontested. For all practical purposes, genomes (or at least the interesting parts of genomes) could still be thought of as collections of genes. Indeed, when the Human Genome Project (HGP) first announced its intention to sequence the entire human genome, much of the opposition to that proposal was premised on this assumption. Thus, e.g., Bernard Davis of the Harvard Medical School complained that “blind sequencing of the genome can also lead to the discovery of new genes …, but this would not be an efficient process. On average, it would be necessary to plow through 1–2 million ‘junk’ bases before encountering an interesting sequence” (Davis 1990, p. 343). And in a similar vein, Robert Weinberg of MIT argued:

The sticky issue arises at the next stage of the project, its second goal, which will involve determining the entire DNA sequence of our own genome and those of several others. Here one might indeed raise questions, as it is not obvious how useful most of this information will be to anyone. This issue arises because upwards of 95% of our genome contains sequence blocks that seem to carry little if any biological information. As is well known, our genome is riddled with vast numbers of repetitive sequences, pseudogenes, introns, and intergenic segments.
These DNA segments all evolve rapidly, apparently because their sequence content has little or no effect on phenotype. Some of the sequence information contained in them may be of interest to those studying the recent and distant evolutionary precursors of our modern genome. But in large part, this vast genetic desert holds little promise of yielding many gems. As more and more genes are isolated and sequenced, the arguments that this junk DNA will yield great surprises become less and less persuasive. (Weinberg 1991, p. 78)

Weinberg was soon proven wrong. In the mid-1990s, and largely as a result of research enabled by large-scale DNA sequencing, confidence that non-coding DNA is nonfunctional, that it could be regarded as junk, began an ever more rapid decline; and by the dawn of the new century, few authors (not even Weinberg) still saw the equation between non-coding DNA and junk as viable.

Genes, Genomes, and Genomics, 1990 to the Present

The launching of the HGP in 1990 was almost certainly the most significant moment in the entire history of the term genome. From that point on, usage of the term explodes, jumping from 587 references per year in 1989 to 871 in 1990, and 3,777 in 1991. By 2002, the number passes 10,000, and by 2010, exceeds 21,000 (see Fig. 3).

Fig. 3 Number of articles referring to genome in the biological literature, 1989–2010 (from the Web of Science)

The numbers displayed in Fig. 3 are telling. In particular, they point to a critical transformation in the history of genetics that was in good part triggered by the HGP. The explosive growth over these last two decades in use of the term genome in the professional literature reflects not only the rise of a new science of genomics, but also the growing incompatibilities between its concepts, methods, and ontologies and those of the earlier science of genetics. As Barry Barnes and John Dupré describe that transition in their recent book, Genomes and What to Make of Them, it “involves genomes rather than genes being treated as real, and systems of interacting molecules rather than sets of discrete particles becoming the assumed underlying objects of research” (2008, p. 8).

Yet genes have hardly disappeared from our thinking about genomes. To be sure, it has become difficult to find a definition of the gene upon which researchers can agree, and slippage between different understandings of the term has become endemic. Nevertheless, gene talk remains prominent in discussions of genomes, and what one means by the former term inevitably affects how one understands what the genome is and what it does. There does, however, seem to be a default response that, despite the variability in definitions of the gene, continues routinely to be offered whenever a definition is demanded: the gene is a protein-coding sequence. And indeed, it is precisely in relation to this default definition that research in molecular genomics has proven such a challenge.

I briefly referred to this challenge in my earlier discussion of the rise and fall of the concept of junk DNA, there suggesting that the demise of that concept after the mid-1990s can be directly attributed to the growing impact of genomics research. Here I want to elaborate on that suggestion. Of particular importance were the findings, first, that much if not most of the non-coding DNA in eukaryotic organisms is in fact transcribed; second, that a surprising range of regulatory functions (by 2001, these included RNA interference, co-suppression, transgene silencing, imprinting, and methylation) can be attributed to what by this time had come to be called non-coding RNA (ncRNA); and third, that the human genome contains far fewer genes than had been anticipated.
To some, these findings called for a radical reconceptualization of the nature of genetic regulation, and especially, of the role of RNA in the architecture of complexity in higher organisms; others focused on the unduly restrictive definition of genes as protein-coding sequences of DNA,4 and sought to cast ncRNA sequences as other kinds of “genes” (see, e.g., Eddy 2001). In any case, all of these developments clearly challenged earlier equations between protein-coding sequences and function (or more precisely, between non-coding sequences and non-function).

4 According to John Mattick, e.g., “The failure to recognize the possible significance of these RNAs is based on the central dogma, as determined from bacterial molecular genetics, that genes are synonymous with proteins, and that RNAs are just temporary reflections of this information” (2001, p. 987).

In 2003, a new public research consortium was launched by the US National Human Genome Research Institute (NHGRI) with the explicit goal of finding all the functional elements in the human genome. It was called ENCODE (the ENCyclopedia Of DNA Elements), and the results of the first phase—the detailed analysis of 1% of the human genome—were reported in The ENCODE Project Consortium (2007). They confirmed that the human genome is “pervasively transcribed” (apparently in a developmentally regulated way); that most of the transcripts are non-coding; and that, on the one hand, regulatory sequences may overlap protein-coding sequences and, on the other hand, they are often far removed from coding sequences. Perhaps the biggest surprise was that functional sequences need not be under evolutionary constraint.

The reaction was swift. In his “News and Views” commentary on the report, John Greally wrote,

We usually think of the functional sequences in the genome solely in terms of genes, the sequences transcribed to messenger RNA to generate proteins.
This perception is really the result of effective publicity by the genes, who take all of the credit … They have even managed to have the entire DNA sequence referred to as the ‘genome’, as if the collective importance of genes is all you need to know about the DNA in a cell … Even this preliminary study reveals that the genome is much more than a mere vehicle for genes, and sheds light on the extensive molecular decision-making that takes place before a gene is expressed. (2007, p. 783)

Over the last few years, surprises about the range and extent of ncRNA involvement in molecular decision making have only accelerated. John Mattick (2010b) puts their role in the modulation of chromatin architecture and epigenetic memory at the top of the list. These are processes crucial not only to embryogenesis but also to brain development, plasticity, learning, and, more generally, to regulating the impact of environmental signals. NcRNA transcripts have been shown to be critical to the regulation not only of transcription (both cis and trans driven), but also of alternative splicing, of chromosome dynamics, and of epigenetic memory. Finally, and perhaps most revolutionary, are the findings that implicate ncRNA in the editing of RNA transcripts, thereby modulating the configuration of the regulatory networks these transcripts form.

Behavioral (or physiological) adaptation does not require that environments directly alter DNA sequence: environmental signals trigger a wide range of signal transduction cascades that lead to short-term adaptation. But mechanisms for editing RNA sequences that in turn regulate gene expression may provide the most dramatic case yet for environmental adaptation. Moreover, epigenetic mechanisms (also mediated by ncRNA) begin to dissolve the distinction between short-term and long-term adaptations by lending to such adaptations the possibility of being transmitted inter-generationally.
As Mattick explains, “the ability to edit RNA … suggests that not only proteins but also—and perhaps more importantly—regulatory sequences can be modulated in response to external signals and that this information may feedback via RNA-directed chromatin modifications into epigenetic memory” (2010a, p. 551).

Finally, environmental signals are not restricted to the simple physical and chemical stimuli that directly impinge on the skin: organisms with central nervous systems have receptors for forms of perception that are both far more complex and longer range. Humans have especially sophisticated perceptual capacities, enabling them to respond to a wide range of complex visual, auditory, linguistic, and behavioral/emotional signals in their extended environment. Apparently, as researchers have recently begun to demonstrate, responses to such signals can extend all the way down to the level of gene expression. For instance, in 2007 Steve Cole and his colleagues compared the gene expression patterns in the leukocytes of those who felt socially isolated with the expression patterns of those who felt connected to others, and were able to demonstrate systematic differences in the expression of roughly 1% of the genes assayed (Cole et al. 2007). Subsequent studies have correlated gene expression patterns with other social indicators (e.g., socioeconomic status), providing further evidence for mechanisms of “social signal transduction” that reach all the way down to the level of the genome. Cole explains that, “Socio-environmental processes regulate human gene expression by activating central nervous system processes that subsequently influence hormone and neurotransmitter activity in the periphery of the body. Peripheral signaling molecules interact with cellular receptors to activate transcription factors” (2009, p. 133).
What is a Genome Today?

As experts in the field now generally recognize, the gap between a full complement of protein-coding sequences and the genetic material (or DNA) of an organism is in fact huge: e.g., only 1.2% of human DNA is currently estimated to be devoted to protein-coding sequences. Yet even in the face of so glaring a disparity, even with the growing recognition of how complex and multi-layered are the regulatory processes enabled by the genome, that entity continues to be understood as a collection of genes that are themselves impervious to environmental input. To be sure, other definitions of the genome have been added since the term was first introduced (genome-3 and genome-4), but neither of these has quite replaced the original definition (genome-1). Instead, the more recent definitions appear to have simply been adjoined to the others, as if the genome can still be understood as indiscriminately referring to the organism’s totality of genes and its totality of DNA, and as if the meaning of genetic material (information or instructions) could still be reduced to a set of discrete and autonomous units called genes.

Indeed, a review of the major current scientific glossaries suggests that all four meanings of the term continue in common usage to the present day, in a pattern of coexistence that can be readily observed from the list of definitions shown in Table 1. (It should be noted that the glossaries cited are intended for the larger professional and non-professional community; they are organized in an effort to educate and inform nontechnical readers of the meanings of these terms as employed in the technical literature.) Thus, for example, the Oxford Dictionary of Genetics happily defines the genome as “the total DNA” in a chromosome set [genome-3] or “all of the genes carried by … this chromosome set”; the Craig Venter Institute defines it, alternately, as “all of a living thing’s genetic material,” “the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation” (genome-4), “A collection of genes” (genome-1), and “all the DNA in a cell” (genome-3). The Science Primer provided by the NIH similarly confounds these various definitions, telling us that

a genome contains all of the biological information needed to build and maintain a living example of that organism [genome-4]. The biological information contained in a genome is encoded in its deoxyribonucleic acid (DNA) [genome-3] and is divided into discrete units called genes [genome-1]. Genes code for proteins that attach to the genome at the appropriate positions and switch on a series of reactions called gene expression.

Table 1 Definitions of genome obtained from a range of scientific glossaries (both official and semi-official)

Source: Biology Online, http://www.biology-online.org/dictionary
Definitions: (1) The complete set of genes in an organism (2) The total genetic content in one set of chromosomes

Source: Oxford Reference Dictionary of Biology
Definitions: All the genes contained in a single set of chromosomes, i.e. in a haploid nucleus

Source: Oxford Reference Dictionary of Genetics
Definitions: In prokaryotes and eukaryotes, the total DNA in a single chromosome and in a haploid chromosome set (q.v.), respectively, or all of the genes carried by this chromosome or chromosome set; in viruses, a single complement of DNA or RNA

Source: Glossary of Genetic Terms, Genetic Education Center, Univ. of Kansas Medical Center
Definitions: All of the genes carried by a single gamete; the DNA content of an individual, which includes 44 autosomes, 2 sex chromosomes, and the mitochondrial DNA

Source: Glossary, Human Genome Project Information (http://www.ornl.gov/sci/techresources/Human_Genome/glossary)
Definitions: All genetic material in the chromosomes of a particular organism; its size is generally given as its total number of base pairs

Source: National Human Genome Research Institute Glossary (http://ghr.nlm.nih.gov/glossary=genome)
Definitions: The genome is the entire set of genetic instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes, found in the nucleus, as well as a small chromosome found in the cells’ mitochondria. These chromosomes, taken together, contain approximately 3.1 billion bases of DNA sequence

Source: National Human Genome Research Institute, Genetic Home Reference Handbook (http://ghr.nlm.nih.gov/handbook/hgp/genome)
Definitions: A genome is an organism’s complete set of DNA, including all of its genes. Each genome contains all of the information needed to build and maintain that organism

Source: Genome News Network, Craig Venter Institute, http://www.genomenewsnetwork.org/resources/whats_a_genome/Chp1_1_1.shtml
Definitions: A gene is a small piece of the genome. It’s the genetic equivalent of the atom: as an atom is the fundamental unit of matter, a gene is the fundamental unit of heredity… A genome is all of a living thing’s genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation. The whole shebang

Source: Genome News Network, Craig Venter Institute, Glossary, http://www.genomenewsnetwork.org/resources/glossary/index.php#g
Definitions: Gene: A piece of DNA used by cells to manufacture proteins, which carry out the business of cells. Each human gene is a template for one or more proteins. Genome: A collection of genes. All living things have genomes…A genome contains the biological information for building, running, and maintaining an organism—and for passing life on to the next generation…A precise definition of genome is “all the DNA in a cell” because this includes not only genes but also DNA that is not part of a gene, or non-coding DNA

Source: National Center for Biotechnology Information, Science Primer, NIH, http://www.ncbi.nlm.nih.gov/About/primer/genetics_genome.html
Definitions: Life is specified by genomes. Every organism, including humans, has a genome that contains all of the biological information needed to build and maintain a living example of that organism. The biological information contained in a genome is encoded in its deoxyribonucleic acid (DNA) and is divided into discrete units called genes. Genes code for proteins that attach to the genome at the appropriate positions and switch on a series of reactions called gene expression

Old and New Conceptual Frameworks

What is most immediately striking here is not only the persistence of older usages of the term, maintained alongside meanings corresponding to more recent experimental practices, but also, through that persistence, the perpetuation of conceptual frameworks to which the older meanings were attached. Thus, e.g., an understanding of the genome as an organism’s totality of genes recalls the classical discourse of gene action and Alfred H. Sturtevant’s reformulation of embryology’s question of how an egg develops into a complex many-celled organism as one of “how genes produce their effects.” As Sturtevant wrote, “in most cases there is a chain of reaction between the direct activity of a gene and the end-product that the geneticist deals with as a character” (1932, p. 307). Molecular biology succeeded in unpacking that “chain of reaction”: genes were identified as DNA sequences, their “direct activity” as that of coding for proteins, and the “discourse of gene action” was replaced by the central dogma.
Indeed, the Craig Venter Institute’s genome (like that implied by the Science Primer of the NIH) is still a “collection of genes” carrying “the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation.” Despite all the changes our conception of gene has undergone since the days of Sturtevant, even the most recent formulations retain the view of genes (and hence of genomes) as effectively autonomous formal agents, containing “all of the biological information needed to build and maintain a living example of that organism,” the blueprint for an organism’s life.

But current research in genomics leads to a rather different picture, and it does so by focusing attention on features that have so far been missing from our conceptual framework. In addition to providing information required for building and maintaining an organism, the genome also provides a vast amount of information for adapting and responding to (for interacting with) the environment in which it finds itself, as indeed it must if the organism is to develop more or less normally, and to survive more or less adequately. Rather than a set of genes initiating causal chains leading to the formation of traits, I suggest that the genome that now appears before us is first and foremost an exquisitely sensitive reaction (or response) mechanism—a device for regulating the production of specific proteins in response to the constantly changing signals it receives from its environment. The signals that the genome detects come most immediately from its intra-cellular environment, but these reflect, in turn, input from the external environments both of the cell and of the organism.

This reformulation gives rise to an obvious question: if the genome is so responsive to its environment, how is it that the developmental process is as reliable as it is?
This is a question of major importance in biology, and it is rapidly becoming evident that the answer must be sought not only in the structural (sequence) stability of the genome, but also in the relative constancy of the environmental inputs, and, most importantly, in the dynamic stability of the system as a whole (see, e.g., Keller 2000). Genomes are responsive, but far from infinitely so; the range of possible responses is severely constrained, both by the organizational dynamics of the system in which they are embedded and by their own structure.

Conclusion: Consequences of Such a Reformulation

Changes in DNA sequences (mutations) deserve the attention we give them because they endure, passed on from one generation to the next—in a word, inherited. Even if not themselves genes, they are clearly genetic. Some of these mutations may affect protein sequences, but far more commonly, what they alter is the organism’s capacity to respond effectively to the environment in which the DNA finds itself, or to respond differentially to altered environments. This conclusion may be especially important in medical genomics where researchers routinely seek to correlate the occurrence of disease with sequence variations in the DNA. Since the sequences thus identified are rarely located within protein-coding regions of the DNA, the significance of the correlation must lie elsewhere, i.e., in the regulatory functions of the associated non-genic DNA. Mutations also provide the raw material for natural selection. But when we speak of natural selection as having programmed the human genome, we should remember that it is precisely the capacities to respond and adapt for which natural selection has programmed the human genome. Unfortunately, however, the easy slide from genetics to talk about genes, with all the causal attributes conventionally attributed to those entities, makes this an exceedingly hard lesson to keep hold of.
Finally, this reconceptualization of the genome allows us—indeed obliges us—to abandon the twin dichotomies—on the one hand, between genetics and environment, and on the other, between nature and culture—that have driven so much unnecessary debate, for so many decades. If much of what the genome “does” is to respond to signals from its environment, then the bifurcation of developmental influences into the categories of genetic and environmental makes no sense. Similarly, if we understand the term environment to include cultural dynamics (as indeed we must), neither does the bifurcation of biological and cultural factors. We have long understood that organisms interact with their environments, that interactions between genetics and environment, between biology and culture, are crucial to making us what we are. What research in genomics has shown is that biology itself is constituted by those interactions, and is so constituted at every level, even at the level of genetics. Indeed, one might say that what makes a molecule—any molecule—biological is precisely its capacity to sense and react to its environment. To quote a recent article on “Biology as Reactivity”—i.e., on the value of viewing biological systems as fundamentally reactive systems—“reactive systems ‘live’ … in order to react” (Fisher et al. 2011, p. 73). That scientific terms point us to the future even while carrying baggage from the past seems clear. The fact that this dual role can give rise to tension seems equally clear. There is, after all, an inherent unpredictability to scientific inquiry. As Hans-Jorg Rheinberger (1997) has stressed, experimental systems are mechanisms for generating surprise and novelty. Sometimes, the surprises generated by new research demand a turn from past conceptualizations so sharp that the older terms can no longer bear the tension. The obvious question is: does molecular biology now find itself at such a crossroads?
References

Barnes B, Dupré J (2008) Genomes and what to make of them. University of Chicago Press, Chicago
Beurton PJ, Falk R, Rheinberger HJ (eds) (2000) The concept of the gene in development and evolution. Cambridge University Press, Cambridge
Cole SW (2009) Social regulation of human gene expression. Curr Dir Psychol Sci 18(3):132–137
Cole SW, Hawkley LC, Arevalo JM, Sung CY, Rose RM, Cacioppo JT (2007) Social regulation of gene expression in human leukocytes. Genome Biol 8:R189
Davis BD (1990) The human genome and other initiatives. Science 249(4967):342–343
Dawkins R (1976) The selfish gene. Oxford University Press, New York
Doolittle WF, Sapienza C (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284(5757):601–603
Eddy SR (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet 2:919–929
Fisher J, Harel D, Henzinger TA (2011) Biology as reactivity. Commun ACM 54(10):72–82
Greally JM (2007) Genomics: encyclopaedia of humble DNA. Nature 447:782–783
Keller EF (2000) The century of the gene. Harvard University Press, Cambridge
Lederberg J, McCray AT (2001) ‘Ome sweet’ omics: a genealogical treasury of words. Scientist 15(7):8
Mattick JS (2001) Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep 2(11):986–991
Mattick JS (2010a) RNA as the substrate for epigenome–environment interactions. BioEssays 32:548–552
Mattick JS (2010b) Non-coding RNAs in epigenetics (interview). http://www.epigenie.com/Interviews/John-Mattick-ncRNAs-onthe-Epigenome.html. Accessed 15 Dec 2011
Ohno S (1972) So much “junk” DNA in the genome. In: Smith HH (ed) Evolution of genetic systems. Brookhaven symposia in biology, vol 23. Gordon & Breach, New York, pp 366–370
Orgel LE, Crick FH (1980) Selfish DNA: the ultimate parasite. Nature 284(5757):604–607
Pearson H (2007) Genetics: what is a gene? Nature 441:398–401
Polanyi M (1958) Personal knowledge: towards a post-critical philosophy. University of Chicago Press, Chicago
Rheinberger HJ (1997) Toward a history of epistemic things: synthesizing proteins in the test tube. Stanford University Press, Stanford
Snyder M, Mark Gerstein M (2003) Defining genes in the genomics era. Science 300(5617):258–260
Srb AM, Owen RD, Edgar RS (1965) General genetics, 2nd edn. Freeman & Company, New York
Sturtevant AH (1932) The use of mosaics in the study of the developmental effects of genes. In: Proceedings of the sixth international congress of genetics, Ithaca, New York, p 304
The ENCODE Project Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799–816
Thomas CA Jr (1971) The genetic organization of chromosomes. Annu Rev Genet 5:237–256
Weinberg RA (1991) There are two large questions. Debate 5:78
